This page is an excerpt and archive of the corresponding thread on MITBBS (未名空间).
EmergingNetworking board - Real world problem
s*****g
Posts: 1055
1
This is a real-world network problem; experts, please comment.
We run a global CDN serving content to a large number of Internet users, although we care most about US users. To guarantee user experience, we place our content in data centers located strategically near major tier-1 providers at major exchanges, and use BGP anycast so that users reach the nearest copy. Geo-aware DNS hands out a regional anycast address for each DNS query, so data centers within a region can also back each other up. This works well enough for most users, but not for all, so we want more regions. As we bring up more data centers to serve users better, we face a new problem: we cannot get new /24 address space. As you all know, many ISPs do not accept prefixes longer than /24, which means BGP anycast will not work for us anymore. The other problem with the current solution is the overhead of managing BGP to get everything right.
To solve this and to scale for future growth, I am thinking of building a customized geo-aware anycast DNS system. We can place our anycast DNS servers in Akamai's network. The DNS servers would not only be geo-aware; they would also dynamically monitor each data center's content engine load, health, latency, etc., and give out the best answer to each query. This way our content engines can use any unicast IP addresses. One problem with this approach is that we have to set a small TTL on records, because DNS is now handling redundancy. A side effect of a low TTL is that end users must query our DNS servers more frequently, which will certainly slow things down. Another problem is that an end user can still cache the record: if the cached content engine IP happens to be down, users will not get content, whereas our current BGP anycast guarantees that the IPs given out by DNS are alive somewhere. On top of that, I am not sure how much effort is needed to integrate health monitoring with DNS, although I know it is doable.
Any thoughts on how this problem can be solved in a more elegant way?
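As a rough illustration of the health-aware, geo-aware answer selection described in the post above, here is a minimal Python sketch. The `ContentEngine` record, the region labels, the `max_load` threshold, and the example addresses (taken from the documentation ranges) are all assumptions made for illustration; nothing here comes from the poster's actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentEngine:
    ip: str        # unicast address of the content engine (hypothetical)
    region: str    # region label a geo database would assign (hypothetical)
    healthy: bool  # result of the most recent health probe
    load: float    # 0.0 (idle) to 1.0 (saturated)


def pick_answer(client_region, engines, max_load=0.9):
    """Return the IP to hand out for a query from client_region, or None."""
    # Never answer with an engine that is down or overloaded.
    usable = [e for e in engines if e.healthy and e.load < max_load]
    if not usable:
        return None
    # Prefer engines in the client's region; otherwise fall back to any usable one.
    local = [e for e in usable if e.region == client_region]
    pool = local or usable
    return min(pool, key=lambda e: e.load).ip


ENGINES = [
    ContentEngine("192.0.2.10", "us-west", True, 0.40),
    ContentEngine("192.0.2.20", "us-west", False, 0.10),   # failed its probe
    ContentEngine("198.51.100.10", "us-east", True, 0.20),
]

print(pick_answer("us-west", ENGINES))  # the healthy us-west engine
print(pick_answer("eu", ENGINES))       # no local engine, so the least-loaded fallback
```

Because the answer depends on probe results that can change at any moment, a design like this only works with the short TTL the post worries about.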
a**********k
Posts: 1953
2
I am not quite sure what the exact problem is, but you may want to take a look at the BIG-IP GTM. It can do geo-DNS and health monitoring, among other things:
http://www.f5.com/products/big-ip/global-traffic-manager.html

[In reply to s*****g]

s*****g
Posts: 1055
3
Put one F5 box in each location for DNS anycast? That is too expensive. Our philosophy is that if we can do something ourselves, we don't buy it. One or two F5 boxes definitely won't fit the bill: we want DNS queries for our domain to be extremely fast. Besides, F5 cannot get around the low-TTL and cached-DNS problems either. I know geo-aware DNS is readily available in BIND9; I am just not sure how hard it is (if it is possible at all) to integrate Nagios/Ganglia with BIND.
I am not looking at a specific vendor right now, just trying to get the concept right first.

[In reply to a**********k]
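The low-TTL trade-off raised in this post can be put into rough numbers. The sketch below is a back-of-the-envelope model only: it assumes every caching resolver honors the TTL exactly and refreshes as soon as the record expires, which real resolvers and client caches do not always do. The function name and the resolver count are hypothetical.

```python
def ttl_tradeoff(ttl_seconds, resolvers):
    """Return (queries_per_hour, worst_case_stale_seconds) for a given TTL.

    Assumes each of `resolvers` caching resolvers re-queries exactly once
    per TTL interval, so the query rate is an upper bound.
    """
    queries_per_hour = resolvers * 3600 / ttl_seconds
    # A resolver that cached the record just before a failure keeps
    # serving the dead IP until the TTL runs out.
    worst_case_stale_seconds = ttl_seconds
    return queries_per_hour, worst_case_stale_seconds


for ttl in (30, 300, 3600):
    rate, stale = ttl_tradeoff(ttl, resolvers=1000)
    print(f"TTL {ttl:>4}s: up to {rate:.0f} queries/hour, stale for up to {stale}s")
```

Cutting the TTL from an hour to 30 seconds shrinks the stale window by a factor of 120 and multiplies the authoritative query load by the same factor, which is the tension the thread keeps returning to.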

z**r
Posts: 17771
4
"a new problem: we can not get new /24 address space": the anti-IPv6 guys should see this, a real IPv4 problem today.
So, the bottom line: the solution should satisfy your requirements for at least the next 3 to 5 years, if not longer.
WAN load balancers can resolve part of the issue. If you think F5 or Cisco boxes are too expensive, maybe you can take a look at Linux-based solutions. The box won't carry much through traffic, only DNS resolution. The health check shouldn't be a showstopper for you; a script can do it easily. But you are right, DNS caching is annoying.
Can we take a look at IPv6? In case you don't know yet, Akamai has officially started supporting IPv6, and I am sure your ISPs support IPv6 too. It's not easy, but maybe you can find a solution that works for both IPv4 and IPv6 users.

[In reply to s*****g]
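To give a sense of how small the "a script can do it" health check mentioned in this post could be, here is a minimal Python sketch that treats a content engine as alive if a TCP connection succeeds. The function names, the 2-second timeout, and the choice of a plain TCP connect (rather than, say, an HTTP request) are assumptions for illustration.

```python
import socket


def tcp_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_all(targets, timeout=2.0):
    """Map each (host, port) target to its latest up/down result."""
    return {target: tcp_alive(target[0], target[1], timeout) for target in targets}
```

A scheduler would call `probe_all` periodically and feed the results to whatever generates the DNS answers. A TCP connect only shows that the port is open, so a real deployment would likely also want an application-level check.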

s*****g
Posts: 1055
5
Thanks for your thoughts. IPv6 is not an option at the moment, as IPv4 users are still dominant. Plus, Akamai only serves content; our CDN does more than serve content, it does other things that Akamai cannot offer.

[In reply to z**r]

z**r
Posts: 17771
6
When I mentioned an IPv6 solution and said it is not easy to find, I meant that even though your data center is IPv6, it should still allow IPv4 users to access it. It might be complicated, but I think you can find a solution. A lot more details are needed.

[In reply to s*****g]

f*****m
Posts: 416
7
Why can't you use the existing anycast addresses?

[In reply to s*****g]

s*****g
Posts: 1055
8
Good question. We need more anycast address groups to get finer region designation, so that people from, say, Santa Clara County go to a DC in Santa Clara, not to the DC in LA that serves the whole west coast.
Theoretically we could get by with just one anycast range, but performance would suffer and, more importantly, fallback would be hard to design, because routing is mostly out of our control. We have to run geo-aware DNS alongside BGP anycast. Each region has its own anycast range; we now need finer regions, hence more anycast ranges.

[In reply to f*****m]
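The "one anycast range per region" constraint can be made concrete with Python's `ipaddress` module. In this sketch the allocation `10.10.0.0/22` and the region names are hypothetical placeholders, and the /24 cutoff is the thread's own premise that many ISPs filter longer prefixes.

```python
import ipaddress

# Premise from the thread: prefixes longer than /24 are widely filtered.
MAX_ACCEPTED_PREFIXLEN = 24


def anycast_ranges(allocation):
    """Split an allocation into the /24 blocks that can be announced separately."""
    net = ipaddress.ip_network(allocation)
    if net.prefixlen > MAX_ACCEPTED_PREFIXLEN:
        return []      # too small to be accepted at all
    if net.prefixlen == MAX_ACCEPTED_PREFIXLEN:
        return [net]   # exactly one announceable range
    return list(net.subnets(new_prefix=MAX_ACCEPTED_PREFIXLEN))


def assign_regions(regions, allocation):
    """Give each region its own /24; return (assigned, regions left without one)."""
    ranges = anycast_ranges(allocation)
    assigned = dict(zip(regions, ranges))
    unassigned = list(regions[len(ranges):])
    return assigned, unassigned


regions = ["us-west", "us-east", "eu", "asia", "santa-clara"]
assigned, unassigned = assign_regions(regions, "10.10.0.0/22")
for region, prefix in assigned.items():
    print(region, prefix)
print("no range left for:", unassigned)
```

A /22 yields only four /24 ranges, so the fifth, finer-grained region has nothing to announce, which is the scaling limit described in the post.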

f*****m
Posts: 416
9
If all other attributes are equal, IGP cost should act as the tie-breaker, so the Santa Clara DC would still be preferred.

[In reply to s*****g]

s*****g
Posts: 1055
10
We send communities so that ISPs set preference. IGP cost is far down BGP's decision chain; besides, we peer with different ASes in different locations.

[In reply to f*****m]
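The disagreement about where IGP cost sits in BGP's decision chain can be sketched with a simplified route comparison. This is a deliberately reduced model of the commonly documented best-path order (local preference, AS-path length, origin, MED, eBGP over iBGP, then IGP cost to the next hop). It omits vendor-specific steps such as weight, compares MED across all routes rather than only between routes from the same neighbor AS, and leaves out the final router-ID tie-breakers. The `Route` class and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    local_pref: int = 100  # higher wins; often set by the ISP from a community
    as_path_len: int = 1   # shorter wins
    origin: int = 0        # 0 = IGP, 1 = EGP, 2 = incomplete; lower wins
    med: int = 0           # lower wins (simplified: compared across all routes)
    ebgp: bool = True      # eBGP-learned routes beat iBGP-learned ones
    igp_cost: int = 0      # lower wins, but only if everything above ties


def preference_key(route):
    """Sort key in which the smallest tuple is the preferred route."""
    return (
        -route.local_pref,
        route.as_path_len,
        route.origin,
        route.med,
        0 if route.ebgp else 1,
        route.igp_cost,
    )


def best_path(routes):
    return min(routes, key=preference_key)


santa_clara = Route("Santa Clara", igp_cost=5)
la_preferred = Route("LA", local_pref=200, igp_cost=50)
la_equal = Route("LA", igp_cost=50)

print(best_path([santa_clara, la_preferred]).name)  # local preference decides
print(best_path([santa_clara, la_equal]).name)      # all else equal, IGP cost decides
```

Both posters are right within their own assumptions: IGP cost favors the nearby DC only when every earlier attribute ties, and a community that changes local preference overrides it.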

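f*****m
Posts: 416
11
If you mean the cost community, that is still evaluated after IGP cost. Do you use your own ASN or the SP's? If you use the SP's, the local site is naturally preferred.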

[In reply to s*****g]

R*****A
Posts: 127
12
A practical solution:
1. Announce a BGP anycast prefix, say 1.1.1.1/24. (One risk here is that the best BGP path is not necessarily the best path for traffic in terms of RTT, bandwidth, etc.)
2. The global DNS always resolves CDN.com to 1.1.1.1/24 (a single-address DNS A record, or geo-aware DNS with multiple A records, whatever).
3. Suppose that for a specific address, say 1.1.1.1/24 reached through one ISP, the traffic load is heavier and you want to load-balance. Place a geo-DNS there (this one is not the global zone; it is restricted to answering queries from a specific region), and you can put multiple addresses (1.1.1.1, 2.2.2.1, 2.2.2.2, etc.) in this A record.
4. So globally you use BGP anycast with the top-level DNS A record. Queries from a specific region are answered by that sub-region's zone-based DNS with multiple A records. As a result, you can use more IPv4 addresses, announce more IPs (with BGP anycast if you want), load-balance, etc.
5. Keep in mind that there is no best solution. The Internet has been messy since day one.
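The two-tier scheme in the steps above can be sketched as a lookup with a regional override. The addresses are the placeholder values used in the post itself; the region label `us-west`, the `query_count` rotation, and the function name are assumptions added for illustration.

```python
# Tier 1: the global zone always answers with the anycast address (step 2).
GLOBAL_ANSWER = ["1.1.1.1"]

# Tier 2: a regional zone may list several addresses for load balancing (step 3).
REGIONAL_ZONES = {
    "us-west": ["1.1.1.1", "2.2.2.1", "2.2.2.2"],
}


def resolve(client_region, query_count=0):
    """Return the A records for a client, rotated round-robin by query_count."""
    answers = REGIONAL_ZONES.get(client_region, GLOBAL_ANSWER)
    shift = query_count % len(answers)
    # Rotating the list spreads clients that pick the first record.
    return answers[shift:] + answers[:shift]


print(resolve("asia"))         # no regional zone: the global anycast address
print(resolve("us-west", 0))   # regional zone, first rotation
print(resolve("us-west", 1))   # regional zone, next rotation
```

Regions without an override keep the single anycast answer, so extra addresses are spent only where load justifies them.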
p*****s
Posts: 344
13
I am no expert on this; I am just trying to understand your problem.
You want Internet users to reach the nearest content server.
You want your content servers to be able to back each other up.
You are running out of /24 addresses, and ISPs don't accept longer prefixes.
You are using anycast to achieve load balancing. Apparently, relying on ISP anycast support saves you a load director and a health monitor.
So now you have to implement your own load director and health monitor, and to save cost (or for your own interest) you want the DNS to take on these two extra functions.
The math doesn't add up here.
BTW, how many regions are you talking about?
s*****g
Posts: 1055
14
You understood my problem perfectly. Can you elaborate on "the math doesn't add up here"?

[In reply to p*****s]

m**t
Posts: 1292
15
I don't see why performance would suck with a single anycast range. Isn't that how Google's public DNS 8.8.8.8 does it?
And which level of fallback do you mean?

[In reply to s*****g]

s*****g
Posts: 1055
16
1) We don't have as many resources as Google. Google can place its DNS servers in hundreds or thousands of data centers.
2) We rely on ISPs to fail our traffic over between DCs; I believe Google has more control over how 8.8.8.8 is routed. It would be unacceptable for us to fail over from a DC in LA to a DC in NY.
3) Our application is different from public DNS. We have stringent SLA requirements: while an end user can tolerate a slow DNS response (it is free, right?), we lose money if content is not served within a certain time.
There are other reasons why we need more than one anycast range, but for the sake of discussion, just assume that we must have more than one.

[In reply to m**t]

p*****s
Posts: 344
17
I mean that you expect your own solution to replace ISP anycast, which provides both health monitoring and load direction, at lower cost and with no performance penalty. That sounds too good to be true.

[In reply to s*****g]
