#2126526 ·published 2012-03-10 02:47 UTC
From: "Adam Rothschild" <asr@voxel.net>
To: nyiix@telehouse.com
Sent: Friday, March 9, 2012 9:46:24 PM
Subject: Re: NYIIX: ISPnet peering at NYIIX

Adding to the below, I'd be interested in seeing a detailed postmortem,
with a focus on steps Telehouse is taking to make sure this kind of
customer [mis]configuration will never, ever, take down NYIIX in the
future.  Unfortunately the information out of official channels has been
suspiciously sparse thus far.  Blaming this all on a software fault in
Brocade's IronWare is disingenuous to customers, and stops short of
painting a full picture of the problem set.

A common best practice among exchange operators these days is enabling
port security, along with the filtering of keepalive/discovery protocols
and other possible L2 noise.  In addition, sniffers/collectors exist to
look for any leaks coming out of customer ports, which can then be quickly
traced back and remediated.  Hopefully this is an architectural
optimization we can count on seeing in the coming days/weeks, though
absent any commentary, it's difficult to even speculate.
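
To make the idea concrete, here is a minimal sketch of the kind of check such
a collector might run.  It is not a description of anything Telehouse or
NYIIX actually operates.  The function names, the one-MAC-per-port limit,
and the allowed EtherType list (IPv4, ARP, IPv6) are all assumptions chosen
for illustration; the reserved destination addresses are the standard IEEE
link-local block (01:80:C2:00:00:00 through :0F, which covers STP and LLDP)
and the Cisco discovery address used by CDP/VTP.

```python
from collections import defaultdict

# Assumed policy for a customer-facing exchange port (illustrative only).
ALLOWED_ETHERTYPES = {0x0800, 0x0806, 0x86DD}  # IPv4, ARP, IPv6
MAX_MACS_PER_PORT = 1                          # assumed port-security limit


def parse_mac(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


IEEE_RESERVED_PREFIX = bytes.fromhex("0180c20000")  # 01:80:C2:00:00:0X
CISCO_DISCOVERY = parse_mac("01:00:0c:cc:cc:cc")    # CDP/VTP destination


def is_l2_control(dst: bytes) -> bool:
    """True for keepalive/discovery destinations that should never leak."""
    in_reserved_block = dst[:5] == IEEE_RESERVED_PREFIX and dst[5] <= 0x0F
    return in_reserved_block or dst == CISCO_DISCOVERY


def audit(frames):
    """Flag leaks in an iterable of (port, src_mac, dst_mac, ethertype).

    Returns a list of (port, reason, src_mac) findings, in arrival order.
    """
    seen = defaultdict(set)  # port -> source MACs observed so far
    findings = []
    for port, src, dst, ethertype in frames:
        src_b, dst_b = parse_mac(src), parse_mac(dst)
        if is_l2_control(dst_b):
            findings.append((port, "l2-control", src))
        elif ethertype not in ALLOWED_ETHERTYPES:
            findings.append((port, "ethertype", src))
        # Port security: report each new source MAC beyond the limit.
        if src_b not in seen[port]:
            seen[port].add(src_b)
            if len(seen[port]) > MAX_MACS_PER_PORT:
                findings.append((port, "mac-limit", src))
    return findings
```

Each finding carries the port and source MAC, which is what an operator
would need to trace a leak back to a specific customer.  A real deployment
would enforce the same policy in the switch itself and use the collector
only as a second line of detection.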

I also remember some stated plans of moving away from a flat VLAN and
towards a VPLS-based topology, benefits being better route protection and
separation of control and data-plane traffic.  Is this still something we
can count on, and when?

In line with the below, we made the executive decision to disable our
NYIIX port for a ~24 hour window following yesterday's issues, in hopes
of ensuring platform stability.  In the future, peers can expect a similar
or lengthened response.

-a (Voxel / Internap, AS 29791, http://as29791.peeringdb.com)

On 3/9/12 8:36 PM, "Ken Cen" <kcen@starlan.com> wrote:

>We have had clients report data corruption, i.e., files sent over the NYIIX
>were corrupt.
>
>What would it take for the NYIIX to provide a high-availability switching
>fabric?  If it is already commissioned, why wasn't it working?
>
>Thank you
>
>-Ken
>
>-----Original Message-----
>From: Bob Tinkelman [mailto:bob@tink.com] On Behalf Of ISPnet Peering
>Desk
>Sent: Friday, March 09, 2012 7:19 PM
>To: nyiix@telehouse.com
>Subject: NYIIX: ISPnet peering at NYIIX
>
>We are ISPnet (AS22691).
>
>All our BGP sessions at the NYIIX are (intentionally) down.
>
>When last night's Telehouse NYIIX problems caused significant problems
>on our backbone switches, we shut down our NYIIX link.
>
>We hope to be able to turn the port back up within 2 weeks, but will
>most likely delay until either we are convinced that Telehouse has fixed
>the underlying problem or until we implement a method to prevent any
>such recurrence from propagating to our net.
>
>We'll email again when we have a firm schedule.
>
>In the meantime, if any other NYIIX members were able to determine what
>sort of traffic was being emitted from the NYIIX switches during last
>evening's problems, I'd appreciate learning what you know.  I've asked
>Telehouse but, so far, have not received a response.
>
>--
>Bob Tinkelman          <bob@tink.com>
>ISPnet, Inc.    http://www.ispnet.net
>+1 (718) 464-4747  office
>+1 (800) 806-NETS  toll free
>+1 (718) 217-9407  fax