gapis

Configuration format

{
// Load balancing policy name.
// Currently, the only selectable client-side policy provided with gRPC
// is 'round_robin', but third parties may add their own policies.
// This field is optional; if unset, the default behavior is to pick
// the first available backend.
// If the policy name is set via the client API, that value overrides
// the value specified here.
//
// Note that if the resolver returns at least one balancer address (as
// opposed to backend addresses), gRPC will use grpclb (see
// https://github.com/grpc/grpc/blob/master/doc/load-balancing.md),
// regardless of what LB policy is requested either here or via the
// client API.
'loadBalancingPolicy': string,

// Per-method configuration. Optional.
'methodConfig': [
{
// The names of the methods to which this method config applies. There
// must be at least one name. Each name entry must be unique across the
// entire service config. If the 'method' field is empty, then this
// method config specifies the defaults for all methods for the specified
// service.
//
// For example, let's say that the service config contains the following
// method config entries:
//
// 'methodConfig': [
// { 'name': [ { 'service': 'MyService' } ] ... },
// { 'name': [ { 'service': 'MyService', 'method': 'Foo' } ] ... }
// ]
//
// For a request for MyService/Foo, we will use the second entry, because
// it exactly matches the service and method name.
// For a request for MyService/Bar, we will use the first entry, because
// it provides the default for all methods of MyService.
'name': [
{
// RPC service name. Required.
// If using gRPC with protobuf as the IDL, then this will be of
// the form "pkg.service_name", where "pkg" is the package name
// defined in the proto file.
'service': string,

// RPC method name. Optional (see above).
'method': string,
}
],

// Whether RPCs sent to this method should wait until the connection is
// ready by default. If false, the RPC will abort immediately if there
// is a transient failure connecting to the server. Otherwise, gRPC will
// attempt to connect until the deadline is exceeded.
//
// The value specified via the gRPC client API will override the value
// set here. However, note that setting the value in the client API will
// also affect transient errors encountered during name resolution,
// which cannot be caught by the value here, since the service config
// is obtained by the gRPC client via name resolution.
'waitForReady': bool,

// The default timeout in seconds for RPCs sent to this method. This can
// be overridden in code. If no reply is received in the specified amount
// of time, the request is aborted and a deadline-exceeded error status
// is returned to the caller.
//
// The actual deadline used will be the minimum of the value specified
// here and the value set by the application via the gRPC client API.
// If either one is not set, then the other will be used.
// If neither is set, then the request has no deadline.
//
// The format of the value is that of the 'Duration' type defined here:
// https://developers.google.com/protocol-buffers/docs/proto3#json
'timeout': string,

// The maximum allowed payload size for an individual request or object
// in a stream (client->server) in bytes. The size which is measured is
// the serialized, uncompressed payload in bytes. This applies both
// to streaming and non-streaming requests.
//
// The actual value used is the minimum of the value specified here and
// the value set by the application via the gRPC client API.
// If either one is not set, then the other will be used.
// If neither is set, then the built-in default is used.
//
// If a client attempts to send an object larger than this value, it
// will not be sent and the client will see an error.
// Note that 0 is a valid value, meaning that the request message must
// be empty.
'maxRequestMessageBytes': number,

// The maximum allowed payload size for an individual response or object
// in a stream (server->client) in bytes. The size which is measured is
// the serialized, uncompressed payload in bytes. This applies both
// to streaming and non-streaming requests.
//
// The actual value used is the minimum of the value specified here and
// the value set by the application via the gRPC client API.
// If either one is not set, then the other will be used.
// If neither is set, then the built-in default is used.
//
// If a server attempts to send an object larger than this value, it
// will not be sent, and the client will see an error.
// Note that 0 is a valid value, meaning that the response message must
// be empty.
'maxResponseMessageBytes': number
}
]
}
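The timeout rule described in the comments above (use the minimum of the configured value and the value set through the client API, and fall back to whichever one is set) can be sketched as follows. This is only an illustration, not gRPC's implementation: `parse_duration` and `effective_timeout` are hypothetical helper names, and the parser handles only the plain seconds form of a Duration string such as "60s".

```python
from typing import Optional


def parse_duration(value: str) -> float:
    """Parse a Duration string such as '60s' or '1.5s' into seconds."""
    if not value.endswith("s"):
        raise ValueError(f"invalid duration: {value!r}")
    return float(value[:-1])


def effective_timeout(config_timeout: Optional[str],
                      api_timeout: Optional[float]) -> Optional[float]:
    """Return the smaller of the two timeouts; None means no deadline."""
    candidates = []
    if config_timeout is not None:
        candidates.append(parse_duration(config_timeout))
    if api_timeout is not None:
        candidates.append(api_timeout)
    return min(candidates) if candidates else None


print(effective_timeout("60s", 10.0))  # 10.0
print(effective_timeout("60s", None))  # 60.0
print(effective_timeout(None, None))   # None
```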

Example

{
  "methodConfig": [
    {
      "name": [
        {
          "service": "google.analytics.admin.v1alpha.AnalyticsAdminService"
        }
      ],
      "timeout": "60s",
      "retryPolicy": {
        "maxAttempts": 5,
        "initialBackoff": "1s",
        "maxBackoff": "60s",
        "backoffMultiplier": 1.3,
        "retryableStatusCodes": [
          "UNAVAILABLE",
          "UNKNOWN"
        ]
      }
    },
    {
      "name": [
        {
          "service": "google.analytics.admin.v1alpha.AnalyticsAdminService",
          "method": "GetAccount"
        },
        {
          "service": "google.analytics.admin.v1alpha.AnalyticsAdminService",
          "method": "ListAccounts"
        },
        {
          "service": "google.analytics.admin.v1alpha.AnalyticsAdminService",
          "method": "DeleteAccount"
        }
      ],
      "timeout": "60s"
    }
  ]
}
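The matching rule from the configuration format (an exact service/method entry is preferred, otherwise the service-wide entry provides the default) can be sketched in Python. This is a minimal illustration rather than gRPC's own lookup code: `find_method_config` is a hypothetical helper, and the config is a shortened copy of the example above.

```python
from typing import Optional

SERVICE = "google.analytics.admin.v1alpha.AnalyticsAdminService"

# Shortened copy of the example service config.
service_config = {
    "methodConfig": [
        {"name": [{"service": SERVICE}],
         "timeout": "60s",
         "retryPolicy": {"maxAttempts": 5}},
        {"name": [{"service": SERVICE, "method": "GetAccount"}],
         "timeout": "60s"},
    ]
}


def find_method_config(config: dict, service: str, method: str) -> Optional[dict]:
    """Prefer an exact service/method match, else the service-wide default."""
    service_default = None
    for entry in config.get("methodConfig", []):
        for name in entry.get("name", []):
            if name.get("service") != service:
                continue
            if name.get("method") == method:
                return entry  # exact match wins
            if not name.get("method"):
                service_default = entry  # default for all methods of the service
    return service_default


exact = find_method_config(service_config, SERVICE, "GetAccount")
fallback = find_method_config(service_config, SERVICE, "UpdateAccount")
print("retryPolicy" in exact)     # False
print("retryPolicy" in fallback)  # True
```

In this sketch the matched entry is used as a whole, so GetAccount gets no retry policy even though the service-wide entry defines one.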

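API - Specification

Goals

API design should be driven by a set of goals, including but not limited to: easy to read, easy to understand, easy to use, clear (unambiguous and hard to misuse), easy to maintain, easy to extend, and powerful enough to meet the requirements. To reach these goals, consider the following details when designing an API.

  • Documentation
  • Naming conventions for controllers and actions
  • Stability and consistency
  • Flexibility
  • Security
  • Validation
  • HTTP status codes
  • Help pages
  • Logging

In terms of form, APIs are not limited to HTTP/Web; there are also gRPC and Thrift.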

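Perspectives

A well-designed API can be approached from two perspectives.

  • The API implementer's perspective
    • What does this service need to do
    • What does this service need to provide
    • How can the API be made more general (input, output, extensibility)
  • The API consumer's perspective
    • How do consumers integrate our API
    • How can consumers supply data to the API and get output more flexibly
    • How can consumers get the information they need in less time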

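Specification

Categories

Errors

Errors occur when a client sends invalid data to the service and the service correctly rejects it. Errors do not affect service availability.

Faults

Faults occur when the service fails to return a correct result for a valid request. Faults do affect service availability.

Call failures caused by rate limiting or quota failures (configured in advance) do not count as faults. Request failures caused by the service protecting itself, such as a fast-fail strategy, do count as faults.

Latency

Latency is the time taken to complete a particular API call, measured as close to the client as possible. For long-running operations, this metric records the duration of the initial request.

Time to complete

Services that expose long-running operations must track the "Time to complete" metric for those operations.

Long running API faults

For a long-running operation, both the initial request and the retrieval requests may succeed, but if the operation ultimately fails, the failure must be rolled up into the overall availability metrics.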

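Client guidelines

Ignore rule

Ignore extra fields returned by the server.

Variable order rule

Ignore the order of the fields in the data returned by the server.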
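A client that follows both rules might parse responses like this. It is a minimal sketch: the `Contact` type and the `parse_contact` helper are hypothetical and only serve to show that unknown fields are dropped and field order does not matter.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Contact:
    name: str
    phone: str


def parse_contact(payload: str) -> Contact:
    """Build a Contact from JSON, ignoring unknown fields and field order."""
    raw = json.loads(payload)
    known = {f.name for f in fields(Contact)}
    return Contact(**{k: v for k, v in raw.items() if k in known})


a = parse_contact('{"name": "cmcc", "phone": "10086"}')
b = parse_contact('{"phone": "10086", "extra": true, "name": "cmcc"}')
print(a == b)  # True
```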

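Consistency fundamentals

URL structure

URLs should be easy to read and easy to construct.

URL length

The HTTP/1.1 specification (RFC 7230) does not define a limit on URL length. If a service receives a request whose URL is longer than the limit the service itself defines, it should return status code 414.

Therefore, for URLs longer than 2083 characters, consider whether the service can accept them.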
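A service-side length check could look roughly like this. The limit of 2083 is simply the figure mentioned above, used here as a hypothetical service limit, and `check_url_length` is an illustrative helper, not part of any framework.

```python
from http import HTTPStatus

MAX_URL_LENGTH = 2083  # hypothetical limit chosen by the service


def check_url_length(url: str, limit: int = MAX_URL_LENGTH) -> HTTPStatus:
    """Return 414 when the URL exceeds the service-defined limit, else 200."""
    if len(url) > limit:
        return HTTPStatus.REQUEST_URI_TOO_LONG  # 414
    return HTTPStatus.OK


print(int(check_url_length("http://localhost/contacts")))      # 200
print(int(check_url_length("http://localhost/" + "a" * 3000)))  # 414
```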

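Supported methods

Method | Description | Idempotent
GET | Return the current value of an object | Yes
PUT | Replace an object, or create it | Yes
DELETE | Delete an object | Yes
POST | Create a new object from the supplied data, or submit a command | No
HEAD | Return the metadata of an object for a GET response; a resource that supports GET should also support HEAD | Yes
PATCH | Apply a partial update to an object | No
OPTIONS | Get information about a request | Yes

Custom headers

Custom headers must not be required for the basic operation of a given API.

Naming conventions

  • Request and response parameters should not use abbreviations (e.g. msg)

response

The response data structure should be layered; business data should not be placed at the top level.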

# Error
{
  "status": 1000,
  "errors": {
    "target": "ContactInfo",
    "message": "Multiple errors in ContactInfo data",
    "details": [
      {
        "code": "NullValue",
        "target": "PhoneNumber",
        "message": "Phone number must not be null"
      },
      {
        "code": "MalformedValue",
        "target": "Address",
        "message": "Address is not valid"
      }
    ]
  }
}

# Success
{
  "status": 0,
  "data": {
    "contacts": [
      {
        "name": "cmcc",
        "phone": "10086"
      }
    ]
  }
}

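Status codes should be divided into fine-grained ranges, for example:

Status code | Description | HTTP status code
0 | Request completed normally | 200
1000+ | Request error (invalid parameters, data not found, etc.) | 400
2000+ | Metadata read error (missing or malformed) | 200
3000+ | Error during processing | 500
4000+ | Error while writing data | 500
5000+ | Unknown service error | 500

The message format should also be consistent (avoid mixing English and Chinese, with some messages in English and others in Chinese). Always return a meaningful message when the code is not 0.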
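The layered response and the status code ranges can be combined in a small helper. This is a sketch under assumptions: `http_status_for` and `make_response` are hypothetical names, and treating any value outside the listed ranges as 500 is a choice made here, not something the table specifies.

```python
from typing import Any, Optional


def http_status_for(status: int) -> int:
    """Map an application status code to the HTTP status code in the table."""
    if status == 0:
        return 200
    if 1000 <= status < 2000:
        return 400
    if 2000 <= status < 3000:
        return 200
    return 500  # 3000+, 4000+, 5000+; other values are treated as 500 by assumption


def make_response(status: int, data: Any = None,
                  message: Optional[str] = None) -> dict:
    """Wrap business data under 'data', or an error message under 'errors'."""
    if status == 0:
        return {"status": 0, "data": data}
    if not message:
        raise ValueError("a non-zero status requires a message")
    return {"status": status, "errors": {"message": message}}


ok = make_response(0, data={"contacts": [{"name": "cmcc", "phone": "10086"}]})
bad = make_response(1000, message="Phone number must not be null")
print(http_status_for(ok["status"]), http_status_for(bad["status"]))  # 200 400
```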

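QA

For some design details it is hard to tell which implementation is better; they are discussed here.

One endpoint for several functions vs. several endpoints

For several similar functions, should there be a single endpoint plus a discriminator field, or should they be split into separate endpoints?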


top

Interactive commands

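c  toggle the full COMMAND line
m  toggle the memory display
e  cycle the memory unit used for processes
E  cycle the memory unit used in the summary area
P  sort by CPU usage
W  save the current configuration to the user directory (~/.toprc OR ~/.config/procps/toprc OR ...)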

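Sorting

Sort by a specific column

  • Run top
  • Press o
    • then enter the column name (case-insensitive)
  • Press O
    • then enter the column name (case-insensitive) to set the secondary sort key

Specifying a pid

A single pid

$ top -p <pid>      # on macOS use the -pid option

Multiple pids

$ top -p `pgrep -d ',' python3`  # not macOS

Specifying a user

$ top -u wii       # show only this user's processes; on macOS use the -U option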

bazel - deps

google-apis

# http_archive has to be loaded before it can be used in a WORKSPACE file.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "com_google_googleapis",
    strip_prefix = "googleapis-8b976f7c6187f7f68956207b9a154bc278e11d7e",
    urls = ["https://github.com/googleapis/googleapis/archive/8b976f7c6187f7f68956207b9a154bc278e11d7e.tar.gz"],
)

load("@com_google_googleapis//:repository_rules.bzl", "switched_rules_by_language")

switched_rules_by_language(
    name = "com_google_googleapis_imports",
    gapic = True,
    grpc = True,
    java = True,
    python = True,
)

google api common protos

# com_google_googleapis already includes the common protos; kept for now
http_archive(
    name = "com_google_api_common_protos",
    strip_prefix = "api-common-protos-1db64f2e971e7ea0e15769164a67643539f0994f",
    urls = ["https://github.com/googleapis/api-common-protos/archive/1db64f2e971e7ea0e15769164a67643539f0994f.tar.gz"],
)

linux commands misc

netstat

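# Options
-a  list all connections
-t  TCP connections
-u  UDP connections
-n  disable reverse DNS lookups to speed up output
-l  list only listening ports
-p  show the PID and program name
-e  show extended information, such as the owning user
-s  print network statistics (packets received, sent, etc.)
-r  print routing information
-i  print network interface information
-c  print network information continuously

# Common usage
$ netstat -ct    # continuous output
$ netstat -atnp  # all active TCP connections
$ netstat -ie    # network interface information in a human-friendly format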

watch

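# Options
-n  set the interval in seconds
-d  highlight the regions that changed
-t  turn off the header showing the interval, command, and current time

# Examples
$ watch -n 1 -d netstat -ant        # watch network connections change every second
$ watch -n 1 -d 'pstree|grep http'  # watch http processes change every second
$ watch -n 10 'cat /proc/loadavg'   # print the system load average every 10 seconds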

awk

# Example
$ awk '{ print $1 }' # print the first column

sed

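# Synopsis
$ sed [-Ealn] [-e command] [-f command_file] [-i extension] [file ...]
$ sed [-Ealn] command [file ...]
# command
[address[,address]]function[arguments]
# Example command
1,20s/old/new/g
# Options
-n  silent mode; print only the lines that are explicitly processed
-e  add an editing command given on the command line
-i  edit the file in place
-f  read sed commands from a file
# function
a  append (after the line)
i  insert (before the line)
c  change (replace the line)
d  delete
p  print the selected lines
s  substitute
# Matching
## \( \) captures a group; \1, \2 refer to the captured values
$ echo "http://localhost:8080/uri/path?p=v" | sed -e 's/^\([^:]*\):\/\/\([^:]*\):\([0-9]*\)\(.*\)$/protocol=[\1] host=[\2] port=[\3] pathAndParams=[\4]/g'
protocol=[http] host=[localhost] port=[8080] pathAndParams=[/uri/path?p=v]

# Examples
$ sed '1,20s/old/new/g' file   # replace old with new on lines 1-20
$ sed '2,5d' file              # delete lines 2-5
$ sed '3,$d' file              # delete from line 3 to the end of the file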
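For comparison, the capture-group substitution from the URL example can be reproduced with Python's re module, which uses the same \1, \2 back-references in the replacement. This is only an illustrative equivalent; `split_url` is a hypothetical helper name.

```python
import re

# Same pattern as the sed example; Python groups use plain parentheses.
URL_PATTERN = re.compile(r"^([^:]*)://([^:]*):([0-9]*)(.*)$")


def split_url(url: str) -> str:
    """Rewrite a URL into labelled parts using back-references."""
    return URL_PATTERN.sub(
        r"protocol=[\1] host=[\2] port=[\3] pathAndParams=[\4]", url)


print(split_url("http://localhost:8080/uri/path?p=v"))
# protocol=[http] host=[localhost] port=[8080] pathAndParams=[/uri/path?p=v]
```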

cut

# Examples
$ cut -d ' ' -f3,5 # print columns 3 and 5
$ echo "localhost:8080" | cut -d ':' -f1 # extract the host
localhost

tr

# tr: translate characters; translates or deletes characters
# Synopsis
tr [-Ccsu] string1 string2 # replace each character in string1 with the character at the same position in string2
tr [-Ccu] -d string1
tr [-Ccu] -s string1
tr [-Ccu] -ds string1 string2
# Options
-d  delete the specified characters
-c  complement the set of values in string1
-C  like -c, but complement the set of characters in string1

# class
[:class:] Represents all characters belonging to the defined character class.
alnum <alphanumeric characters>
alpha <alphabetic characters>
blank <whitespace characters>
cntrl <control characters>
digit <numeric characters>
graph <graphic characters>
ideogram <ideographic characters>
lower <lower-case alphabetic characters>
phonogram <phonographic characters>
print <printable characters>
punct <punctuation characters>
rune <valid characters>
space <space characters>
special <special characters>
upper <upper-case characters>
xdigit <hexadecimal characters>

# Examples
$ echo "What a cute dog" | tr a-z A-Z
WHAT A CUTE DOG
$ echo "What a cute dog" | tr '[:lower:]' '[:upper:]'
WHAT A CUTE DOG

sort

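# Sorting
# Synopsis
sort [-bcCdfghiRMmnrsuVz] [-k field1[,field2]] [-S memsize] [-T dir] [-t char] [-o output] [file ...]

# Options
-u  drop lines with duplicate keys
-s  stable sort
-b  ignore leading blanks
-d  dictionary order
-i  ignore non-printable characters
-R  random order
-n  numeric sort
-t  field separator
-k  sort key field

# Examples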
$ cat t2
10.0.0.1:8080
10.0.0.2:8080
10.0.0.1:8090
10.0.0.3:8070
10.0.0.1:8060
$ cat t2 | sort -t ':' -k 1
10.0.0.1:8060
10.0.0.1:8080
10.0.0.1:8090
10.0.0.2:8080
10.0.0.3:8070
$ cat t2 | sort -t ':' -k 2
10.0.0.1:8060
10.0.0.3:8070
10.0.0.1:8080
10.0.0.2:8080
10.0.0.1:8090

uniq

# remove adjacent duplicate lines; usually combined with sort

date

# format
date +"%Y%m%d"
# minute
date +"%M"

# minus (relative date, GNU date)
-d "-1 days"

# yesterday's date
date -d "-1 days" +"%Y-%m-%d"

json

# pretty-print a JSON string
$ echo '{"data":{"name":"wii","age":18}}' | python -m json.tool
{
"data": {
"age": 18,
"name": "wii"
}
}

time

# measure how long a program takes to run
time <program args>
...
real 0m0.003s
user 0m0.001s
sys 0m0.002s

python coroutines

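Coroutines

Task switching is handled by the program itself, which reduces the overhead of thread/process context switches. The switching happens in user space, with no need to enter kernel mode.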
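A minimal asyncio sketch makes the cooperative switching visible: each coroutine hands control back to the event loop at `await asyncio.sleep(0)`, so the two workers interleave inside a single thread. The `worker` and `main` names are illustrative only.

```python
import asyncio


async def worker(name: str, steps: int, log: list) -> None:
    for i in range(steps):
        log.append(f"{name}-{i}")
        await asyncio.sleep(0)  # yield control to the event loop


async def main() -> list:
    log: list = []
    # Both workers run in the same thread and take turns at each await.
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log


print(asyncio.run(main()))  # ['a-0', 'b-0', 'a-1', 'b-1']
```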

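Purpose

Python has multithreading, but because of the GIL, threads cannot make use of multiple cores. If the resources of a single process are not fully used either, this can lead to serious performance problems.

Related

EventLoop

By default, Python only creates a loop for the main thread. The following tornado code automatically creates a loop for any thread; install it with asyncio.set_event_loop_policy(AnyThreadEventLoopPolicy()).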

if sys.platform == "win32" and hasattr(asyncio, "WindowsSelectorEventLoopPolicy"):
    # "Any thread" and "selector" should be orthogonal, but there's not a clean
    # interface for composing policies so pick the right base.
    _BasePolicy = asyncio.WindowsSelectorEventLoopPolicy  # type: ignore
else:
    _BasePolicy = asyncio.DefaultEventLoopPolicy


class AnyThreadEventLoopPolicy(_BasePolicy):  # type: ignore
    """Event loop policy that allows loop creation on any thread.

    The default `asyncio` event loop policy only automatically creates
    event loops in the main threads. Other threads must create event
    loops explicitly or `asyncio.get_event_loop` (and therefore
    `.IOLoop.current`) will fail. Installing this policy allows event
    loops to be created automatically on any thread, matching the
    behavior of Tornado versions prior to 5.0 (or 5.0 on Python 2).

    Usage::

        asyncio.set_event_loop_policy(AnyThreadEventLoopPolicy())

    .. versionadded:: 5.0

    """

    def get_event_loop(self) -> asyncio.AbstractEventLoop:
        try:
            return super().get_event_loop()
        except (RuntimeError, AssertionError):
            # This was an AssertionError in Python 3.4.2 (which ships with Debian Jessie)
            # and changed to a RuntimeError in 3.4.3.
            # "There is no current event loop in thread %r"
            loop = self.new_event_loop()
            self.set_event_loop(loop)
            return loop
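Without such a policy, a worker thread has to create its loop explicitly, as the docstring above notes. The following is a minimal sketch of that explicit approach using only the standard library; `compute`, `worker` and `results` are illustrative names.

```python
import asyncio
import threading

results = {}


async def compute() -> int:
    await asyncio.sleep(0)
    return 42


def worker() -> None:
    # Create and install a loop explicitly for this (non-main) thread.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        results["value"] = loop.run_until_complete(compute())
    finally:
        asyncio.set_event_loop(None)
        loop.close()


t = threading.Thread(target=worker)
t.start()
t.join()
print(results["value"])  # 42
```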

Examples

Timer

Below is a timer implemented with coroutines.

# coding: utf-8
import asyncio
import threading
import time
from datetime import datetime
from typing import Callable


class Scheduler:
    # Names of the schedules that are currently active.
    cache: set[str] = set()

    @classmethod
    async def _do_schedule(cls, name: str, delay: int, interval: int, cb: Callable, args, kwargs):
        await asyncio.sleep(delay)
        # Keep firing until stop(name) removes the name from the cache.
        while name in cls.cache:
            try:
                cb(*args, **kwargs)
            except Exception as e:
                print('execute target failed, e=', e)
            await asyncio.sleep(interval)

    @classmethod
    def _schedule_wrapper(cls, name: str, delay: int, interval: int, cb: Callable, args, kwargs):
        # Each schedule runs its own event loop in a dedicated thread.
        asyncio.run(cls._do_schedule(name, delay, interval, cb, args, kwargs))

    @classmethod
    def schedule(cls, name: str, delay: int, interval: int, cb: Callable, *args, **kwargs):
        assert name not in cls.cache, 'duplicate scheduler with name ' + name
        # Register the name before starting the thread, so the loop condition
        # already holds even when delay is 0.
        cls.cache.add(name)
        threading.Thread(target=cls._schedule_wrapper,
                         args=(name, delay, interval, cb, args, kwargs),
                         daemon=True).start()

    @classmethod
    def stop(cls, name: str):
        # discard() is a no-op when the name is not registered.
        cls.cache.discard(name)


def cbk(a, b, c):
    print('execute at', datetime.now(), 'with args:', (a, b, c))


if __name__ == '__main__':
    Scheduler.schedule('first', 1, 1, cbk, 'a', 'b', c='c')
    Scheduler.schedule('second', 1, 1, cbk, 'd', 'e', c='f')
    time.sleep(3)
    Scheduler.stop('first')
    try:
        while True:
            time.sleep(1)  # sleep instead of busy-waiting
    except KeyboardInterrupt:
        pass

Exceptions

loop argument must agree with Future

First, look at the code that raises the exception.

def ensure_future(coro_or_future, *, loop=None):
    """Wrap a coroutine or an awaitable in a future.
    If the argument is a Future, it is returned directly.
    """
    if futures.isfuture(coro_or_future):
        if loop is not None and loop is not coro_or_future._loop:
            raise ValueError('loop argument must agree with Future')
        return coro_or_future
    elif coroutines.iscoroutine(coro_or_future):
        if loop is None:
            loop = events.get_event_loop()
        task = loop.create_task(coro_or_future)
        if task._source_traceback:
            del task._source_traceback[-1]
        return task
    elif compat.PY35 and inspect.isawaitable(coro_or_future):
        return ensure_future(_wrap_awaitable(coro_or_future), loop=loop)
    else:
        raise TypeError('A Future, a coroutine or an awaitable is required')
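The first branch above can be triggered with a small reproduction: a future that belongs to one loop is passed together with a different loop. This is a sketch for illustration; `loop_a` and `loop_b` are arbitrary names, and newer Python versions word the error message differently while still raising ValueError.

```python
import asyncio

loop_a = asyncio.new_event_loop()
loop_b = asyncio.new_event_loop()

future = loop_a.create_future()  # this future is bound to loop_a

try:
    asyncio.ensure_future(future, loop=loop_b)  # loop does not match the future
    raised = False
except ValueError as e:
    raised = True
    print("ValueError:", e)

# With the matching loop, the same future object is returned unchanged.
same = asyncio.ensure_future(future, loop=loop_a)
print(same is future)  # True

future.cancel()
loop_a.close()
loop_b.close()
```

The usual fix is to create and await a future on the same loop, for example by not sharing futures between threads that each run their own loop.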


curator x discovery

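Concepts

  • ServiceDiscovery: required before a ServiceProvider object can be created; all of its requests go directly to zk
  • ServiceProvider: a wrapper for discovering one particular service, with a built-in load-balancing strategy; it integrates ServiceCache, so it has node watching and caching
  • ServiceCache: caches data in local memory and uses watchers to keep it up to date

Notes

ServiceDiscovery and ServiceProvider are usable only after their start method has been called.

Registration

  • After a service is registered with ServiceDiscovery's registerService, the node stays registered as long as the ServiceDiscovery is not stopped
  • If the service is forcibly stopped without calling unregisterService in time to cancel the registration, the zk node is kept for a while (a few seconds) and is then removed by zk

Query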

  • The query interfaces of ServiceDiscovery call zk in real time, while ServiceProvider answers from its ServiceCache

Listening

ServiceCacheListener has two methods.

  • cacheChanged: called when the service nodes change
  • stateChanged: called when the zk connection state changes