The region problem is neither big nor small: if you need to plan logistics precisely (or might need to someday) in order to cut shipping costs, you cannot avoid it.
China's administrative division codes are fairly simple. I won't go into the encoding rules here (search for the relevant GB standards if you are curious), but one rule from a GB document matters: a retired code is never reused — that single rule may be enough for quite a while. The structure is a clear three-level hierarchy: provinces and municipalities at the first level, cities or counties at the second, and districts at the third. Due to administrative streamlining or development needs, however, some areas have no third level — Zhongshan in Guangdong, for example. Such cases are rare, but this knowledge is essential if you want to build an accurate region table.
If you run an e-commerce site where users can only place orders on your own site, there is no problem. But at any real scale these days you will inevitably integrate with other e-commerce sites or platforms — in plain terms, you open a shop on their site and their orders get imported into your system. (If you never need to import external orders, you can stop reading; you will hardly ever hit the region problem.) Unfortunately, everyone is now busy collecting data — members, orders, users' personal information — and orders are far too important to pass up, so orders from other platforms will end up imported into your own system, and one key field in every order is the shipping address.
At that point a question arises: is the other platform's region data the same as ours? Almost never. A good developer might fetch an official dataset from the Bureau of Statistics; an average one just grabs whatever looks reasonable from a web search. Without a careful comparison you won't notice the gaps, because province/city/district data changes slowly.
The Bureau of Statistics updates this dataset every year (the data is fine, apart from its messy formats). Sometimes you see a brand-new release, sometimes only the publication date of an old one is bumped, but the changes are always small: a few cities renamed, merged, demoted to districts, or districts promoted to cities. Precisely because the changes are small, a random dataset off the internet really can work for a while — but sooner or later the debt comes due.
Because no common standard is followed — you use region version A, I use B, he uses C — whose version wins? In practice nobody changes their own region data, since everyone carries the burden of historical records, so the usual "integration" strategy is no integration at all: just concatenate province, city, district, and street address into one string. That looks fine and may even be usable, but the data is no longer structured, or at least no longer conforms to any standard, because some of those regions may not exist on your side; when you later need to process these addresses, it will be painful.
The problem I faced: nobody knew which version our existing system's region data was, and the orders integrated from platforms like X-mao carried no structured province/city/district data. Updating our own region database was not much work: fetch the latest data, store it in the three-level structure, diff it against the existing data, and apply the adds, (soft) deletes, and renames. The primary key is the region code, whose encoding is simple: the first two digits identify the province, the middle two the city, and the last two the district; for a province-level code the city and district positions are all zeros. Since we know retired codes are never reused, the code is safe to use as a primary key.
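Splitting such a code takes only a few lines; here is a minimal sketch (the function name and return shape are my own, not from any standard):

```python
def parse_region_code(code):
    """Split a 6-digit Chinese administrative division code.

    Digits 1-2 identify the province, 3-4 the city, 5-6 the district;
    a province-level code pads the city and district positions with zeros.
    """
    assert len(code) == 6 and code.isdigit()
    province, city, district = code[:2], code[2:4], code[4:6]
    if city == "00":
        level = "province"
    elif district == "00":
        level = "city"
    else:
        level = "district"
    return {"province": province, "city": city,
            "district": district, "level": level}
```

For example, `440000` is Guangdong (province level), `440100` is Guangzhou (city level), and `440106` is a district of Guangzhou.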
This fixes the mismatch between our own data and the latest release, but it still does not solve the cross-platform inconsistency; short of everyone adopting a single standard, there is no easy fix. Here I can only complain about X-bao: its website and mobile app use different region data, and worse, in the mobile version every city contains a catch-all district — "Other". The PC version looks accurate, though because they manage addresses at a finer granularity, they add "districts" that do not officially exist to make address entry easier — certain industrial parks, development zones, new areas, and so on. In the end we have to handle such addresses ourselves, deciding which real district each one belongs to; there are many such mapping rules, all in the name of keeping the data truthful.
I hope that companies with the power to make the world a little better will start with details like these and ship improvements people can actually feel; a good number of people would feel the professionalism, and the happiness.
Given all that, we decided to integrate the third-party platforms directly and send the corresponding messages through them. The app itself is a letdown, though: tapping a notification always drops the user into the app's message center, where they must tap the message again to see its content. That may have been the quick way to build it, but the experience is poor and out of line with every mainstream app today — it feels like the previous era.
That is the background. On top of it, we want to use the app as a push channel for member messages, because SMS is becoming less and less viable and is now limited to transactional messages such as verification codes. All the messages discussed above are in fact transactional, as opposed to marketing messages: they are triggered by a user's own action or by some condition being met, are specific to one user, and are generally not broadcast; bulk promotional messages are the marketing kind. The precondition for using the app as the push channel is being able to determine, reasonably accurately, whether a given user's app channel is usable; if not, we must fall back to SMS or similar.
The technology is similar across mobile platforms, but the mechanisms differ, and for a reason. On iOS the flow is: the device connects to APNs, your backend pushes the message to APNs, and APNs delivers it to the user's device. Android in China is different: for well-known reasons Google's GCM service (the APNs analogue) is unreachable, which gave rise to many third-party push providers. These providers handle Android push and usually take over the APNs leg as well — which is actually welcome, since pushing to APNs yourself is not trivial (a long-lived socket connection, a binary message protocol, and so on all raise the development cost), so for an app of ordinary scale, a third-party push provider is effectively the only choice.
iOS push: when a message is successfully handed to APNs, the provider gets no response at all, and delivery is not guaranteed. If the user is offline, APNs keeps the message for a while and delivers it when they come back online, but after a long enough absence the message is dropped. Only syntactic problems, such as a malformed payload, produce a response. Separately, if APNs finds the app is gone from a device (deleted, or push disabled by the user), it records that device ID, and providers can fetch those IDs through the APNs feedback service — which means you only learn whether the channel is still alive after you try pushing to it again.
Android push: third-party providers ship an SDK that app developers embed. Once such an app is installed, a push service process runs in the background; cleverly, if several apps on the device use the same provider, they do not each start their own service — one or two processes serve all of them, which is how the providers deliver their advertised "shared long connection, battery friendly" story. The background push service connects to the provider's backend; when a message arrives, the service knows which app it belongs to, and from there the app's own code takes over. Thanks to this mechanism, an Android push provider can know in real time whether a message reached the device — but since that is a rarely-needed feature, providers generally do not expose such an API. The one we use only admitted by email that a similar API exists, at VIP level; contact their sales team for details.
Running sysctl gave me `sysctl: setting key "net.ipv4.tcp_congestion_control": No such file or directory`, so I followed this post to build a kernel module for Ubuntu 14.04.
The net/ipv4 hybla Makefile(bad script format in original post):
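For reference, a typical out-of-tree module Makefile for this job looks roughly like the following — a sketch that assumes `tcp_hybla.c` has been copied from the kernel's `net/ipv4` sources:

```makefile
obj-m := tcp_hybla.o

# build against the headers of the running kernel
KDIR := /lib/modules/$(shell uname -r)/build

all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean
```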
I also got a warning while building the module — `No X.509 certificates found` — but I don't think it matters: I can build `tcp_hybla.ko` and load it into the kernel.
Maybe the memory limit results in some primary shards not being loaded by elasticsearch. If one or more primary shards are unassigned, your cluster will be in red status; if all primary shards are loaded but some replicas are not, the status is yellow; if all shards are loaded, you get green.
In elasticsearch, data is stored in an index; an index is divided into shards, and shards come in two kinds, primary and replica. You can have no replicas, but you cannot have no primaries. Replica shards are stored on the other nodes of the cluster — but in the default setup here there were only two nodes, one data node and one client node, while the index setting was 2 or 3 replicas for each primary shard; you can update the settings of all indices to disable this.
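Replicas can be disabled per index through the settings API; a sketch (the index name is hypothetical):

```
PUT /my_index/_settings
{
  "index": { "number_of_replicas": 0 }
}
```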
I think it’s not a new methodology, but a new concept.
Think of it this way: you build an application using libraries written by others. Those libraries are independent pieces of software that expose APIs; your application imports them and calls those APIs to get things done. Generally they keep the same APIs while updating and fixing bugs, so you can upgrade to the latest version. You could instead copy their code into your own code base, as if you had written it all yourself — but then every upstream update means copying the new code over again. That is what development without MSA looks like.
So MSA is, in a sense, a methodology for dependency management; the example above is MSA in miniature. Real MSA means a set of independent applications — an order service, a logistics service, and so on. We used to call these the order module or the logistics module; now we call them the xxx service.
so I filtered out the rows for the specified URL:
then I got these rows:
then I used `awk` to print the fourth column:
Then I got these IPs and stored them in a file, then ran `sort` and `uniq`. `uniq` expects sorted input, and the `-c` option counts the occurrences of each row:
Then I got the unique rows with counts, but I wanted these rows sorted by count, descending:
`-k1nr` defines a key starting at the first field, compared as a (n)umber, in (r)everse order; a form like `-k1,2` would end the key at the second field. A field here means a column of your data.
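The whole pipeline can be sketched end to end on a toy log (the log format and URL are made up; in this toy format the client IP happens to be the fourth whitespace-separated field, matching the awk above):

```shell
# toy access log; the client IP is the 4th field
cat > access.log <<'EOF'
GET /api/foo 200 1.2.3.4
GET /api/foo 200 1.2.3.4
GET /api/foo 200 5.6.7.8
GET /other 200 9.9.9.9
EOF

# filter the URL, print the IP column, count unique IPs, sort by count desc
grep '/api/foo' access.log \
  | awk '{ print $4 }' \
  | sort \
  | uniq -c \
  | sort -k1nr
```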
Subclass `RequestHandler`, map the handlers with routes, pass those routes to an `Application`, and run the `Application`.
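A minimal sketch of that flow (the handler and route are illustrative, not from the original post):

```python
import tornado.ioloop
import tornado.web


class MainHandler(tornado.web.RequestHandler):
    def get(self):
        # dicts are automatically serialized to JSON by Tornado
        self.write({"hello": "world"})


def make_app():
    # map URL patterns to RequestHandler subclasses
    return tornado.web.Application([
        (r"/", MainHandler),
    ])


app = make_app()
# To actually serve:
#   app.listen(8888)
#   tornado.ioloop.IOLoop.current().start()
```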
Tornado supports strings, bytes, and dictionaries as responses; you can build strings and bytes with the Template engine, or write the response by hand. If you pass a dictionary, it is encoded as JSON.
Tornado will not parse other request-argument types — e.g. it will not parse JSON request bodies; you must parse those yourself (override the `prepare` method).
CSRF/XSRF prevention: the session cookie from SiteA has not expired, and SiteB hosts a form that targets SiteA. If SiteA has no XSRF protection, the submission is allowed as long as the form fields are valid. With XSRF protection, SiteA sets a token cookie on the user, and that token must be filled into every mutating form (POST, PUT, DELETE); if the two values are not equal, the request is rejected. Since SiteB cannot read SiteA's cookies, its form cannot submit with a valid token.
Due to the Python GIL (Global Interpreter Lock), it is necessary to run multiple Python processes to take full advantage of multi-CPU machines. Typically it is best to run one process per CPU.
Tornado can be roughly divided into four major components:
Tornado uses a single-threaded event loop. This means that all application code should aim to be asynchronous and non-blocking, because only one operation can be active at a time.
There are many styles of asynchronous interfaces:
There is no free way to make a synchronous function asynchronous in a way that is transparent to its callers.
Coroutines are the recommended way to write asynchronous code in Tornado
A function containing yield is a generator. All generators are asynchronous; when called they return a generator object instead of running to completion.
On macOS, you can use `md5` and `shasum` to calculate file checksums.
md5 xxx.data
shasum -a 1 xxx.data
shasum -a 256 xxx.data
Thanks to the post here.
If you pass `kwargs` to an `Exception` subclass (e.g. `class MyException(Exception): pass`), you might expect a composed message when you catch the exception and read `e.message`. Then, in one place that should have displayed the exception's message, we got nothing. That part of the language is implemented in C, so we downloaded the Python source code (2.7.9) and found `BaseException`'s C implementation.
Reading `BaseException`'s C implementation — I can follow the logic, if not the C syntax — it nominally accepts `kwargs`, but if you actually pass keyword arguments to `Exception`, you get a warning because of `_PyArg_NoKeywords`. Perhaps the parameter is there for compatibility with old code, but today you cannot use `kwargs`.
Another thing to note: if you pass exactly one argument, that argument becomes the value of `e.message`; otherwise, your arguments are placed in `e.args`.
We finally implemented the `__str__` and `__unicode__` methods; because we import `unicode_literals`, we consider this the best practice, and we simply use `str(e)` in those places.
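A sketch of that pattern (the class name and fields are illustrative, not our real code): compose the message yourself in `__init__`, hand the finished string to the base class, and let `__str__`/`__unicode__` share it.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals


class MyError(Exception):
    """Builds its message from named values instead of passing kwargs
    to Exception (which BaseException rejects)."""

    def __init__(self, code, detail):
        self.code = code
        self.detail = detail
        message = "[{0}] {1}".format(code, detail)
        # exactly one positional argument, so it also lands in e.args
        super(MyError, self).__init__(message)

    def __str__(self):
        return "[{0}] {1}".format(self.code, self.detail)

    # Python 2 callers asking for unicode get the same text
    __unicode__ = __str__
```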
I also found a shadowsocks build specialized for OpenWrt. It includes the main China IP blocks — requests within these ranges bypass shadowsocks — and the list comes from APNIC: very precise, but a little long for OpenWrt, 4000+ lines.
I figured the IP blocks with the most hits are the ones I will hit again soon, so I sorted them by hit count, descending.
iptables -t nat -xvnL SS_SPEC_WAN_AC | tail -n+12 | head -n -1 | sort -rnsk 1,1 | awk '{ print $NF }' > /etc/shadowsocks/ignore.list

`tail -n+12` prints from the 12th line to the end; the first 11 lines are two header lines, one line for the shadowsocks server IP, and eight lines of internal IP blocks.

`head -n -1` prints everything except the last line (this doesn't work on macOS); the last line is the default shadowsocks redirect rule, which must stay where it is.

`sort -rnsk 1,1` means (r)everse the result, compare as (n)umeric values (converted from the input strings), use a (s)table sort, and `-k 1,1` defines the sort key: start it at field 1 and end it at field 1 (fields are numbered from 1).

`awk '{ print $NF }'` prints the last column — very cool.

Write the results back to ignore.list and restart shadowsocks; it will use the sorted IP blocks — maybe 0.xx ms faster than before!
Add an alias line to `/etc/postfix/virtual`, like this: `dev-group dev1@company.com dev2@company.com dev3@company.com`. After that, generate the lookup database with `postmap /etc/postfix/virtual`.
Don't forget to add `virtual_alias_maps = hash:/etc/postfix/virtual` to your `main.cf` file, and finally restart postfix.
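Put together, the whole change is only a few lines (the alias name and addresses are the examples from above):

```
# /etc/postfix/virtual
dev-group    dev1@company.com dev2@company.com dev3@company.com

# /etc/postfix/main.cf
virtual_alias_maps = hash:/etc/postfix/virtual

# then, as root:
#   postmap /etc/postfix/virtual
#   postfix reload
```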
I think I need to study postfix systematically and just build a private SMTP/IMAP server.
While print-ing something to stdout, I got `Bad file descriptor` — meaning my stdout was a bad descriptor. What?!
The core code looked like `logger.StreamHandler(os.fdopen(sys.stdout.fileno(), 'w', 0))`: I wanted an unbuffered stdout, so I reopened it with `fdopen` and set the buffer size to 0. I also did the same to stderr, which made no sense, because stderr is normally unbuffered anyway (though in Python 3 on macOS it may be line-buffered).
If your stdout is reopened with `fdopen` and the file object's reference count hits zero, it will be collected by the Python GC — and your stdout gets eaten along with it; things go wrong after that.
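One way to dodge the pitfall is to duplicate the descriptor and hold a long-lived reference to the wrapper — a sketch of my own, not the original code (Python 3 only allows `buffering=0` in binary mode, so line buffering is used here):

```python
import logging
import os
import sys

# Duplicate fd 1 so the real stdout stays untouched, and keep a
# module-level reference so the wrapper can never hit refcount zero;
# if it were collected, its underlying fd would be closed and later
# writes would fail with "Bad file descriptor".
unbuffered_stdout = os.fdopen(os.dup(sys.stdout.fileno()), "w", buffering=1)

handler = logging.StreamHandler(unbuffered_stdout)
logger = logging.getLogger("fdopen-demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("stdout is still alive")
```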
So, be careful with `fdopen` on stdin, stdout, and stderr :)
Location: Beijing
ISP: China Unicom
Access: FTTH (fiber to the home)
I switched to Unicom broadband a few months ago and it has been good, though BitTorrent feels slow; I download little and haven't dug into it, and I mostly use Xunlei offline download anyway, which runs at essentially the direct speed of Xunlei's CDN and saturates my bandwidth.
I tried multi-dial on OpenWrt with no effect; reportedly it was blocked long ago, so I gave up.
A while back I wanted to expose OpenWrt's admin UI on both WAN and LAN, i.e. reach the router from outside. I put it on WAN port 80 by default, and it failed: fine from inside, unreachable from outside. I didn't think much of it at the time, assuming some iptables problem — one look at OpenWrt's rules, nested several layers deep, scared me off. Moving it to another port, say 9999, worked from both sides. Since I rarely accessed it from outside anyway, this need was basically a non-need, so I let it go.
Recently I put a website on a Raspberry Pi, with the router port-forwarding to its internal port 80. Tested from inside the LAN, everything worked, HTTPS included. Then I happened to try it from the office: port 80 was unreachable, but 443 was fine.
Tonight I added a log rule as the very first line of PREROUTING: requests from an external server to port 80 never show up, while 443 does. Since PREROUTING is the first chain executed, the port-80 requests simply never arrive — and this is public IP to public IP, so was the request intercepted and dropped by some router along the way? Googling turned up that years ago the major Chinese ISPs blocked ports 80 and 8080 on their public-IP offerings, supposedly so people cannot casually host websites; fair enough — in China you need an ICP filing anyway.
BUT PORT 443 IS OK…
The application description must follow their template (downloadable — the instructions point you to it), which includes an architecture diagram you should draw carefully. The basic idea (Alibaba's idea) is that all of your business systems go inside ECS; if your own platform appears anywhere outside ECS, they will suspect you are using ECS as a proxy to call the API, because the paid APIs may only be called from inside the tower. Nothing else special; results take about three days.
Once the filing passes, you can create an application; after that you can buy ECS, RDS, and other services (I haven't needed the rest). Both services are bound to an application, so the application must exist first. ECS configurations are limited; I am in Beijing, but the only datacenter is currently Hangzhou, so stating your location is useless. The Hangzhou datacenter supposedly has 6-carrier access with the same egress quality as Taobao, but after getting the ECS instance I pinged around 31 ms, and the next day around 70 ms; ping doesn't prove much, but it feels odd… The price is somewhat higher than regular Aliyun ECS.
The management terminal has a password; if you change it, remember to reboot the machine afterwards or the change will not take effect, and after rebooting, connect through the management terminal again to verify.
They use both the 10.x and 172.16.x private ranges themselves. If you want an internal network of your own, you can consider renting several ECS instances; but if you want to play with LXC, consider the 192.168.x range.
Do not change the DNS servers; use the ones ECS comes configured with. Changing them reportedly breaks internal access and RDS connectivity — after I switched to 114DNS, everything stopped working.
I bought the cheapest ECS, and for some reason it has no swap partition; 512 MB of RAM was exceeded once I started an LXC container and compiled something, and nothing would work until I added swap, which fixed it.
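Adding a swap file goes roughly like this (a sketch; the size and path are my own choice, and the commands must run as root):

```
dd if=/dev/zero of=/swapfile bs=1M count=512     # 512 MB file
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab  # survive reboots
```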
I wanted to use Jushita's push service, which can push orders, products, and other data into the RDS you bought; but when I applied today, they said the application was not yet online and could not apply… I need that very data to develop the application — isn't that a deadlock?
Jan 7, 2015: after talking to support — the application must go online first, even if you have not started developing. A little odd, but it is the only way to get the sync service. Going online is easy: creating the application is just step one; fill in the basic information, click next into the security-scan stage, submit, and it passes review instantly — the application is now live.
Jan 9, 2015: merchant-backend applications come in two architectures, web and client. If you build a web application, you generally use the web architecture: after the user logs in and authorizes, you get the user's session and can start calling APIs, much like most OAuth flows; the client type is similar. One caveat: after a merchant-backend application obtains the user's session and refresh token, the docs say the session is valid for one year, but the refresh token is useless — you can only re-authorize after the session expires.
Once real work started, the API turned out to be not that pleasant to use — a matter of taste, perhaps — but still much better than Paipai's.
Those are the pitfalls so far; updates to the pit collection are welcome.
I have had a Raspberry Pi for several months. I used to record temperature and humidity with it and upload the data to a cloud host, but after moving house, it sat in a drawer. On the last day of 2014, while waiting for 2015, I wanted to set it up again to host an external disk as a MacBook Pro backup target (Time Machine) — and failed on power: a normal power supply is not enough for the Pi plus a disk, so I would need to buy a self-powered USB hub.
My Pi connected to my router (OpenWrt) over a wireless NIC, but I wanted to use an ethernet cable instead so it could get a unique IP on the internet. When I plugged the cable in, the wireless NIC went down — because of ifplugd or something like it — so I kept the wireless NIC working and added port-forward rules on the router instead: WAN 80 to pi 80, WAN 443 to pi 443.
Install nginx via `sudo apt-get install nginx`, remove the default site with `rm /etc/nginx/sites-enabled/default`, and add a conf file in `/etc/nginx/conf.d`, like this:
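A minimal sketch of such a conf (the server name and certificate paths are placeholders, not the original file):

```nginx
server {
    listen 80;
    server_name example.com;
    # send plain HTTP to HTTPS
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;

    root  /usr/share/nginx/www;   # octopress rsync target
    index index.html;
}
```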
I got an SSL certificate from nc.me (GitHub Student Pack), so I rewrite HTTP to HTTPS.
Start nginx via `sudo nginx`, or reload the config via `sudo nginx -s reload`.
Oh, you should update your domain's A record to your router's WAN IP address; you can write a script or use your router's built-in DDNS function.
Octopress supports rsync publishing, so edit your Rakefile and enter your raspi ssh account and the nginx html root path, like `/usr/share/nginx/www`.
Publish to GitHub and the Pi with `rake gen_deploy rsync`, or to the Pi only with `rake rsync`.
You can return `NotImplemented` in magic methods (e.g. `__eq__`, `__ne__`, etc.); see here for details.
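A sketch of the idea (the class names follow the discussion; the `value` attribute is my own invention):

```python
class ClassA(object):
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # ClassA knows how to compare itself against ClassB
        if isinstance(other, ClassB):
            return self.value == other.value
        return NotImplemented


class ClassB(object):
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # "I can't handle this" -- the runtime then tries the
        # reflected operation, i.e. other.__eq__(self)
        return NotImplemented
```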
`a` is an instance of ClassA and `b` an instance of ClassB. If we evaluate `a == b`, we get a result, True or False — but if we evaluate `b == a`, we also get a result! Because ClassB's `__eq__` returns `NotImplemented`, the runtime falls back to ClassA's `__eq__` (the reflected operation), and that method knows how to compare A and B.
`urllib`'s `quote` takes a `safe` string as its second parameter: every character outside the preserved set is replaced by a %-prefixed escape — a space becomes `%20`, and so on.

In my project, I invoked quote as `quote(params, '')`, and since I had imported `unicode_literals`, I passed a unicode parameter to the `quote` method.
What will happen?
The next time you invoke quote with the same `safe` parameter you used before, you may get an error like `UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)`: urllib caches the safe parameter (urllib.py, line 1277), it expects a byte string but you passed a unicode one, and the library neither checks nor converts it before use.
Best practice: pass a byte parameter like `b''` to `quote` if you use unicode literals the way I do.
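The Python 2 caching bug itself can't be reproduced on Python 3, but the `safe` behaviour is the same; a quick illustration with `urllib.parse.quote`:

```python
from urllib.parse import quote  # Python 2: from urllib import quote

# everything outside `safe` is percent-escaped
assert quote("a b/c") == "a%20b/c"          # default safe is "/"
assert quote("a b/c", safe="") == "a%20b%2Fc"
```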
The `w` permission on a directory means you can create or delete the files and directories inside it.
usr = unix software resource
“-” means the previous working directory (cd /mnt; cd; cd -)
umask: the default permission minus this mask. E.g. a new directory defaults to 777; with mask 022 you get 777 - 022 → 755. A new file defaults to 666; with mask 002 you get 666 - 002 → 664. Each mask digit is a sum drawn from (4, 2, 1), and each digit masks one permission group (user/group/others). umask is per-user: root defaults to 0022, normal users to 0002.
lsattr/chattr are available on ext2/ext3 filesystems: adding the ‘+i’ attribute locks a file against editing or deletion, even for root; ‘+a’ lets a file accept appended content only; etc.
SUID, SGID, SBIT special permissions:
SUID: an ‘s’ on the x position of the user part, like ‘rws------’. It can only be set on binary programs. A user with x permission on the binary temporarily gains the owner's permission while running it — e.g. the ‘passwd’ binary and /etc/shadow: shadow can only be updated by root, yet every user can change their own password via ‘passwd’, because SUID temporarily grants the owner's (root's) permission.
SGID: like SUID, but an ‘s’ on the x position of the group part, like ‘rwx--s--x’; a user executing the binary temporarily gains the owning group's permission — e.g. the ‘locate’ binary and ‘mlocate.db’. SGID can also be set on a directory: a user with r and x on an SGID directory creates files whose owning group is the directory's group.
SBIT (Sticky BIT): directories only. In a sticky directory, a file or directory you create can only be deleted by root or by you — e.g. /tmp: root can delete your new file, and so can you, but others cannot.
SUID = 4, SGID = 2, SBIT = 1; in `chmod x755 file`, x is the sum of SUID/SGID/SBIT (zero if omitted). An invalid SUID/SGID/SBIT shows as upper-case S/T; lower-case means valid.
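A quick illustration of the numeric form (the file is a throwaway demo; on a real system you would only set SUID on trusted binaries):

```shell
touch demo_bin
chmod 4755 demo_bin   # 4 = SUID, then rwxr-xr-x
ls -l demo_bin        # user part shows rws: lower-case s, valid SUID
```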
`find / -size 1M -exec ls -l {} \;` — `{}` stands for the arguments coming from find, `\;` marks the end of `-exec`, and commands inside `-exec` cannot use aliases.
Goal: create a user group named rmt_acc, allow only users of this group to access ssh, forbid root login, and use public-key authentication.
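The relevant sshd_config lines amount to something like this (a sketch of the settings the goal above implies, not the full original file):

```
# /etc/ssh/sshd_config
PermitRootLogin no
PubkeyAuthentication yes
PasswordAuthentication no
AllowGroups rmt_acc        # only this group may log in over ssh
```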
To keep telnet from starting after a reboot, comment out the lines beginning with ‘telnet’/‘login’/‘ftp’/‘shell’/‘exec’ in /etc/inetd.conf.