
Docker Hub Rate Limit issue RESOLVED (was: [vpp-dev] Jenkins jobs UNSTABLE due to failure to upload logs to nexus.fd.io)

Dave Wallace
 

Folks,

The Docker Hub Rate Limit issue has been resolved (details below) and the FD.io CI jobs are operating normally.  Please let me know if you encounter any errant failure signatures.

Thanks to Vanessa & Trishan for their help resolving the outage.

Cheers,
-daw-

---- %< ----
Docker Hub Rate Limit Resolution:

It turns out I misunderstood their rate limiting scheme -- the limit is imposed on pull requests made anonymously or with an unauthorized docker id, not on the repository accounts. Therefore we needed to create an authenticated account, add it to the 'fdiotools' 'users' team, and then configure Nomad to log in with that docker id for pull requests from the 'fdiotools' repositories in order to avoid the rate limit.

Vanessa & Trishan created an fd.io email account and docker account which were then added to all of the Jenkins.fd.io Nomad Plugin configuration templates for all FD.io projects.  Nomad is now successfully issuing docker pull requests and spinning up CI job executors at the request of jenkins.fd.io!

Life is good :)
---- %< ----
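
A minimal sketch of how the difference between anonymous and authenticated pulls could be checked, assuming the rate-limit preview endpoint and the `ratelimit-limit` / `ratelimit-remaining` response headers that Docker documents for Docker Hub. The function names are hypothetical and nothing here reflects how the FD.io Nomad or Jenkins configuration actually performs the login.

```python
import base64
import json
import urllib.request

# Endpoints as described in Docker's rate-limit documentation (assumed unchanged).
TOKEN_URL = ("https://auth.docker.io/token"
             "?service=registry.docker.io"
             "&scope=repository:ratelimitpreview/test:pull")
MANIFEST_URL = ("https://registry-1.docker.io/v2/"
                "ratelimitpreview/test/manifests/latest")


def parse_rate_header(value):
    """Parse a header value such as '100;w=21600' into (count, window_seconds)."""
    count, _, rest = value.partition(";")
    window = None
    for part in rest.split(";"):
        key, _, val = part.strip().partition("=")
        if key == "w" and val:
            window = int(val)
    return int(count), window


def fetch_rate_limit(username=None, password=None):
    """Return (limit, remaining); the request is anonymous if no credentials are given."""
    token_req = urllib.request.Request(TOKEN_URL)
    if username and password:
        creds = base64.b64encode(f"{username}:{password}".encode()).decode()
        token_req.add_header("Authorization", f"Basic {creds}")
    with urllib.request.urlopen(token_req) as resp:
        token = json.load(resp)["token"]

    # A HEAD request is enough to read the rate-limit headers.
    head_req = urllib.request.Request(MANIFEST_URL, method="HEAD")
    head_req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(head_req) as resp:
        limit = resp.headers.get("ratelimit-limit")
        remaining = resp.headers.get("ratelimit-remaining")
    return (parse_rate_header(limit) if limit else None,
            parse_rate_header(remaining) if remaining else None)
```

Calling `fetch_rate_limit()` with and without credentials from a Nomad client host would show whether the authenticated docker id is seeing a different limit than anonymous pulls from the same address.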

On 11/18/2020 6:43 PM, Dave Wallace via lists.fd.io wrote:
Folks,

IT-21051 was resolved by Vanessa's ci-management patch [0] while [nearly] simultaneously two patches [1] [2] from Andrew Y were deployed which remove the artifact publishing from the VPP CI jobs.  These changes were subsequently reverted [3].

Operation of VPP CI jobs has been restored and I have done a 'recheck' on all gerrit changes which previously failed due to the UNSTABLE job completion status.

Unfortunately, there is a new issue caused by hitting the Docker Hub Pull limit [4] which is causing job allocations to fail and the jenkins build queue to back up.  I have opened a new LF Help Desk Ticket [4], sent an email to the TSC, and will bring this up in tomorrow's TSC meeting to get it resolved.

There also appears to be a similar issue with the vpp-csit-verify-device-master-1n-skx job, whose runs are failing due to the inability to start containers.

Thank you for your patience during this outage and thanks to Vanessa & the entire LF-IT team who worked on identifying the fix to the log upload issue.  Also a big thank you to Andrew Yourtchenko for his assistance in pushing ci-management patches and Vratko for ci-management patch reviews.

-daw-

[0] https://gerrit.fd.io/r/c/ci-management/+/29986
[1] https://gerrit.fd.io/r/c/ci-management/+/29985
[2] https://gerrit.fd.io/r/c/ci-management/+/29987
[3] https://gerrit.fd.io/r/c/ci-management/+/29988
[4] https://jira.linuxfoundation.org/plugins/servlet/theme/portal/2/IT-21063

On 11/17/2020 12:38 PM, Dave Wallace via lists.fd.io wrote:
Folks,

There is an issue with CI jobs being marked as UNSTABLE due to the failure to upload log files to nexus.fd.io.  This is stalling the CI job pipeline because the checkstyle job is not succeeding.

I have opened a case with LF-IT: https://jira.linuxfoundation.org/plugins/servlet/theme/portal/2/IT-21051

Thanks,
-daw-







Re: FD.io Nomad Infrastructure hitting Docker Hub Rate Limit

Trishan de Lanerolle
 

Hi Jim/Dave,
I'm not aware of the status of the discussion. Will inquire on progress.  
Trishan


On Wed, Nov 18, 2020 at 7:37 PM St Leger, Jim <jim.st.leger@...> wrote:

This new Docker issue has come up more than a few times in more than a few open source places. It was my understanding that the LF was trying to work with Docker to solve this for all of their open source projects, inclusive of LFN/FD.io. If that doesn’t happen then, to Dave’s point, fdiotools will have to go through the singular application to Docker and ask for the open source exemption.

 

Does anyone from the LF have any visibility into where that work stands? (Trishan perhaps?)

 

Jim

 

From: tsc@... <tsc@...> On Behalf Of Dave Wallace
Sent: Wednesday, November 18, 2020 5:12 PM
To: infra-steering@...; tsc@...
Subject: [tsc] FD.io Nomad Infrastructure hitting Docker Hub Rate Limit

 

Folks,

FYI, today the FD.io Nomad CI has been starved of build executors due to hitting the Docker Hub Rate Limit [0] while using the 'fdiotools' account on docker hub.

I have opened an LF Help Desk Ticket [1] requesting that the 'fdiotools' account be upgraded, either via Docker's Open Source Application process [2] or by upgrading the account to 'Pro', which costs $5 a month and would eliminate the issue ASAP.  Unfortunately, until this is resolved there are likely to be daily outages of the FD.io CI pipeline.

Thanks,
-daw-

[0] https://www.docker.com/increase-rate-limits
[1] https://jira.linuxfoundation.org/plugins/servlet/theme/portal/2/IT-21063
[2] https://www.docker.com/community/open-source/application






--
Trishan R. de Lanerolle
Technical Program Manager & Community Architect 
Networking, Linux Foundation
voice: +1.203.699.6401
skype: tdelanerolle


Re: FD.io Nomad Infrastructure hitting Docker Hub Rate Limit

St Leger, Jim
 

This new Docker issue has come up more than a few times in more than a few open source places. It was my understanding that the LF was trying to work with Docker to solve this for all of their open source projects, inclusive of LFN/FD.io. If that doesn’t happen then, to Dave’s point, fdiotools will have to go through the singular application to Docker and ask for the open source exemption.

 

Does anyone from the LF have any visibility into where that work stands? (Trishan perhaps?)

 

Jim

 

From: tsc@... <tsc@...> On Behalf Of Dave Wallace
Sent: Wednesday, November 18, 2020 5:12 PM
To: infra-steering@...; tsc@...
Subject: [tsc] FD.io Nomad Infrastructure hitting Docker Hub Rate Limit

 

Folks,

FYI, today the FD.io Nomad CI has been starved of build executors due to hitting the Docker Hub Rate Limit [0] while using the 'fdiotools' account on docker hub.

I have opened an LF Help Desk Ticket [1] requesting that the 'fdiotools' account be upgraded, either via Docker's Open Source Application process [2] or by upgrading the account to 'Pro', which costs $5 a month and would eliminate the issue ASAP.  Unfortunately, until this is resolved there are likely to be daily outages of the FD.io CI pipeline.

Thanks,
-daw-

[0] https://www.docker.com/increase-rate-limits
[1] https://jira.linuxfoundation.org/plugins/servlet/theme/portal/2/IT-21063
[2] https://www.docker.com/community/open-source/application


FD.io Nomad Infrastructure hitting Docker Hub Rate Limit

Dave Wallace
 

Folks,

FYI, today the FD.io Nomad CI has been starved of build executors due to hitting the Docker Hub Rate Limit [0] while using the 'fdiotools' account on docker hub.

I have opened an LF Help Desk Ticket [1] requesting that the 'fdiotools' account be upgraded, either via Docker's Open Source Application process [2] or by upgrading the account to 'Pro', which costs $5 a month and would eliminate the issue ASAP.  Unfortunately, until this is resolved there are likely to be daily outages of the FD.io CI pipeline.

Thanks,
-daw-

[0] https://www.docker.com/increase-rate-limits
[1] https://jira.linuxfoundation.org/plugins/servlet/theme/portal/2/IT-21063
[2] https://www.docker.com/community/open-source/application


Thursday call

Joel Halpern
 

Unless there is something urgent, I will not make the call.  The IETF meeting is midnight to 6am US EST all week.

Yours,

Joel


FD.io - JIRA Maintenance 2020-11-23 at 0300 UTC to 0600 UTC

Vanessa Valderrama
 

What:  JIRA maintenance to migrate users from LDAP to Internal directory

When:   2020-11-23 at 0300 UTC to 0600 UTC

Impact:  JIRA will be unavailable during this time

Why:  This change is related to the JIRA Auth0 migration

Thank you,
Vanessa


[pma_tools-dev] PMA_TOOLS Project Status

Vanessa Valderrama
 

We have approval to archive the PMA_TOOLS project. I've added this topic to the agenda to vote on Thursday.

Thank you,

Vanessa



-------- Forwarded Message --------
Subject: RE: [pma_tools-dev] PMA_TOOLS Project Status
Date: Tue, 10 Nov 2020 22:40:18 +0000
From: Tkachuk, Georgii <georgii.tkachuk@...>
To: Vanessa Valderrama <vvalderrama@...>, pma_tools-dev@... <pma_tools-dev@...>


That works, we can archive it.
-Georgii

-----Original Message-----
From: Vanessa Valderrama <vvalderrama@...>
Sent: Tuesday, November 10, 2020 3:40 PM
To: Tkachuk, Georgii <georgii.tkachuk@...>; pma_tools-dev@...
Subject: Re: [pma_tools-dev] PMA_TOOLS Project Status

Yes it can.

On 11/10/20 4:38 PM, Tkachuk, Georgii wrote:
Can it be undone at some point in case we need to?
-Georgii

-----Original Message-----
From: pma_tools-dev@... <pma_tools-dev@...> On Behalf Of Vanessa Valderrama
Sent: Tuesday, November 10, 2020 3:38 PM
To: Tkachuk, Georgii <georgii.tkachuk@...>; pma_tools-dev@...
Subject: Re: [pma_tools-dev] PMA_TOOLS Project Status

When we archive the project we set the repository to read only, stop the GitHub replication and remove the Jenkins Jobs.

Thank you,

Vanessa

On 11/10/20 4:32 PM, Tkachuk, Georgii wrote:
Hi Vanessa, we haven't maintained it in a while, but it is there for people to use. What does archiving it entail?
-Georgii

-----Original Message-----
From: pma_tools-dev@... <pma_tools-dev@...> On Behalf Of Vanessa Valderrama
Sent: Tuesday, November 10, 2020 3:28 PM
To: pma_tools-dev@...
Subject: [pma_tools-dev] PMA_TOOLS Project Status

Is the PMA_TOOLS project still active or can it be archived?

Thank you,

Vanessa


CANCELED - Re: FDIO Gerrit Maintenance - 2020-11-04 at 1700 UTC to 1900 UTC

Vanessa Valderrama
 

This maintenance is canceled. The Gerrit resize was completed today during the Jenkins restart.

Thank you,

Vanessa

On 10/29/20 11:51 AM, Vanessa Valderrama wrote:

Correction

When:   2020-11-04 at 1700 UTC to 1900 UTC

Thank you,
Vanessa


On 10/28/20 12:12 PM, Vanessa Valderrama wrote:

What:  Gerrit maintenance to resize the instance

When:   2020-10-04 at 1700 UTC to 1900 UTC

Impact:  Gerrit and Jenkins will be unavailable at this time. Jenkins will be placed in shutdown mode at 1700 UTC. At 1800 UTC jobs will be aborted

Why:  This increase is being done at the request of the FD.io community

Thank you,
Vanessa


Re: FD.io Nomad Issue

Vanessa Valderrama
 

Jenkins has been restarted and jobs are running again.

We got approval to do the Gerrit resize at the same time, so next week's
maintenance will be cancelled.

Thank you,

Vanessa

On 10/29/20 1:33 PM, Vanessa Valderrama wrote:
The community has requested a restart of Jenkins. We're placing Jenkins
in shutdown mode to prepare for the restart.

Thank you,

Vanessa

On 10/29/20 11:26 AM, Vanessa Valderrama wrote:
Nomad executors are not starting in Jenkins. This is due to a DNS
problem: the Nomad URL in Jenkins is configured to use
nomad.fdiopoc.net:4646, which is pointing to the wrong IP address.

; <<>> DiG 9.11.14-RedHat-9.11.14-2.fc30 <<>> nomad.fdiopoc.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22624
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nomad.fdiopoc.net.        IN    A
;; ANSWER SECTION:
nomad.fdiopoc.net.    180    IN    A    157.230.67.179
;; Query time: 38 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 29 09:55:49 CDT 2020
;; MSG SIZE  rcvd: 62

We tried hard-coding the Nomad URL to the IP address 10.30.51.32:4646
and 10.39.51.33:4646. Unfortunately that is not resolving the issue.

We will continue to work with the community to resolve this issue as
quickly as possible.

Thank you,
Vanessa


Re: FD.io Nomad Issue

Vanessa Valderrama
 

The community has requested a restart of Jenkins. We're placing Jenkins
in shutdown mode to prepare for the restart.

Thank you,

Vanessa

On 10/29/20 11:26 AM, Vanessa Valderrama wrote:
Nomad executors are not starting in Jenkins. This is due to a DNS
problem: the Nomad URL in Jenkins is configured to use
nomad.fdiopoc.net:4646, which is pointing to the wrong IP address.

; <<>> DiG 9.11.14-RedHat-9.11.14-2.fc30 <<>> nomad.fdiopoc.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22624
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nomad.fdiopoc.net.        IN    A
;; ANSWER SECTION:
nomad.fdiopoc.net.    180    IN    A    157.230.67.179
;; Query time: 38 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 29 09:55:49 CDT 2020
;; MSG SIZE  rcvd: 62

We tried hard-coding the Nomad URL to the IP address 10.30.51.32:4646
and 10.39.51.33:4646. Unfortunately that is not resolving the issue.

We will continue to work with the community to resolve this issue as
quickly as possible.

Thank you,
Vanessa


Re: FDIO Gerrit Maintenance - 2020-11-04 at 1700 UTC to 1900 UTC

Vanessa Valderrama
 

Correction

When:   2020-11-04 at 1700 UTC to 1900 UTC

Thank you,
Vanessa


On 10/28/20 12:12 PM, Vanessa Valderrama wrote:

What:  Gerrit maintenance to resize the instance

When:   2020-10-04 at 1700 UTC to 1900 UTC

Impact:  Gerrit and Jenkins will be unavailable at this time. Jenkins will be placed in shutdown mode at 1700 UTC. At 1800 UTC jobs will be aborted

Why:  This increase is being done at the request of the FD.io community

Thank you,
Vanessa


FD.io Nomad Issue

Vanessa Valderrama
 

Nomad executors are not starting in Jenkins. This is due to a DNS
problem: the Nomad URL in Jenkins is configured to use
nomad.fdiopoc.net:4646, which is pointing to the wrong IP address.

; <<>> DiG 9.11.14-RedHat-9.11.14-2.fc30 <<>> nomad.fdiopoc.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22624
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nomad.fdiopoc.net.        IN    A
;; ANSWER SECTION:
nomad.fdiopoc.net.    180    IN    A    157.230.67.179
;; Query time: 38 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Oct 29 09:55:49 CDT 2020
;; MSG SIZE  rcvd: 62

We tried hard-coding the Nomad URL to the IP address 10.30.51.32:4646
and 10.39.51.33:4646. Unfortunately that is not resolving the issue.
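
A check like the one done by hand with dig above could be sketched as follows. This is only an illustration: the expected addresses are the two quoted in this message, the function names are hypothetical, and the lookup uses the system resolver rather than querying a specific DNS server.

```python
import ipaddress
import socket

# Addresses quoted in this thread; treat them as illustrative values only.
EXPECTED = {"10.30.51.32", "10.39.51.33"}


def classify(resolved, expected=EXPECTED):
    """Describe how a resolved address relates to the expected set."""
    addr = ipaddress.ip_address(resolved)
    if resolved in expected:
        return "ok"
    if addr.is_private:
        return "unexpected private address"
    return "unexpected public address"


def check_host(hostname, expected=EXPECTED):
    """Resolve hostname via the system resolver and classify the answer."""
    resolved = socket.gethostbyname(hostname)
    return resolved, classify(resolved, expected)
```

With the answer shown in the dig output, `classify("157.230.67.179")` reports an unexpected public address, which is consistent with the name pointing somewhere other than the internal Nomad servers.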

We will continue to work with the community to resolve this issue as
quickly as possible.

Thank you,
Vanessa


Re: Ole Troan to be my proxy at this weeks FD.io TSC

otroan@...
 

The meeting today is likely at 1500UTC.
May I suggest we consider anchoring FD.io meetings to UTC. Then we don't have to deal with random changes of time in local jurisdictions.
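
To illustrate the point, here is a small sketch of how a meeting pinned to a local clock time moves in UTC when daylight saving time ends. The 08:00 US Pacific start time is purely an assumption for the example and is not taken from this thread; fixed offsets are used so the snippet needs no time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Pacific time before and after DST ended on 2020-11-01.
PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")


def utc_hhmm(local_dt):
    """Return the UTC wall-clock time of an aware datetime as 'HHMM'."""
    return local_dt.astimezone(timezone.utc).strftime("%H%M")


# Hypothetical weekly meeting at 08:00 local time (assumed, not from the thread).
before_change = datetime(2020, 10, 29, 8, 0, tzinfo=PDT)
after_change = datetime(2020, 11, 5, 8, 0, tzinfo=PST)

print(utc_hhmm(before_change), utc_hhmm(after_change))
```

A meeting anchored to UTC keeps the same UTC time in both weeks, and the local time shifts instead.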

Cheers,
Ole

On 28 Oct 2020, at 14:48, Ed Warnicke <hagbard@...> wrote:

Ole Troan has graciously agreed to be my proxy at this week's FD.io TSC meeting.

Ed


Regrets

Joel Halpern
 

I had hoped my current meeting would end in time, but no, I cannot join the TSC call today.

Sorry,

Joel


FDIO Gerrit Maintenance - 2020-10-04 at 1700 UTC to 1900 UTC

Vanessa Valderrama
 

What:  Gerrit maintenance to resize the instance

When:   2020-10-04 at 1700 UTC to 1900 UTC

Impact:  Gerrit and Jenkins will be unavailable at this time. Jenkins will be placed in shutdown mode at 1700 UTC. At 1800 UTC jobs will be aborted

Why:  This increase is being done at the request of the FD.io community

Thank you,
Vanessa


Ray Kinsella to chair this weeks FD.io TSC meeting

Edward Warnicke
 

Ray Kinsella has graciously agreed to chair the FD.io TSC meeting this week.

Ed


Ole Troan to be my proxy at this weeks FD.io TSC

Edward Warnicke
 

Ole Troan has graciously agreed to be my proxy at this week's FD.io TSC meeting.

Ed


Re: Please approve Vladimir Lavor as a new GoVPP project committer

Ray Kinsella
 

Thanks Rastislav,

 

We will pick this up at the next TSC meeting.

 

Regards,

 

Ray K

 

From: tsc@... <tsc@...> On Behalf Of Rastislav Szabo -X (raszabo - PANTHEON TECH SRO at Cisco) via lists.fd.io
Sent: Thursday 22 October 2020 20:59
To: tsc@...
Cc: govpp-dev@...
Subject: [tsc] Please approve Vladimir Lavor as a new GoVPP project committer

 

Dear FD.io TSC,

 

I would like to ask you to approve Vladimir Lavor as a new GoVPP project committer.

 

The supermajority of GoVPP committers already voted +1:

 

https://lists.fd.io/g/govpp-dev/topic/new_govpp_committer/77726308?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,77726308

 

Vladimir has been contributing to GoVPP since 2018 and has recently been one of the top code contributors to GoVPP:

 

https://gerrit.fd.io/r/gitweb?p=govpp.git;a=search;s=Vladimir+Lavor;st=author

 

Thanks,

Rastislav


Please approve Vladimir Lavor as a new GoVPP project committer

Rastislav Szabo -X (raszabo - PANTHEON TECH SRO at Cisco) <raszabo@...>
 

Dear FD.io TSC,

 

I would like to ask you to approve Vladimir Lavor as a new GoVPP project committer.

 

The supermajority of GoVPP committers already voted +1:

 

https://lists.fd.io/g/govpp-dev/topic/new_govpp_committer/77726308?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,77726308

 

Vladimir has been contributing to GoVPP since 2018 and has recently been one of the top code contributors to GoVPP:

 

https://gerrit.fd.io/r/gitweb?p=govpp.git;a=search;s=Vladimir+Lavor;st=author

 

Thanks,

Rastislav


GoVPP PTL resignation

Rastislav Szabo -X (raszabo - PANTHEON TECH SRO at Cisco) <raszabo@...>
 

Dear GoVPP community,

 

since I have accepted a new career challenge, I’ve decided to resign from my GoVPP project tech lead position.

 

Although I would still like to retain my GoVPP committer status, I cannot commit to the PTL role anymore.

 

According to the FD.io governance document,

https://fd.io/docs/tsc/FD.IO-Technical-Community-Document-12-12-2017.pdf: 

 

3.2.3.1 Project Technical Leader Candidates

Candidates for the project’s PTL will be derived from the Committers of the Project. Candidates must self-nominate.

 

I'd like to invite any interested GoVPP committer to self-nominate for the PTL role. Please email your self-nomination to the govpp-dev mailing list.

 

Let's close the self-nomination period by Tuesday 27th October 20:00 UTC.

 

Thanks,

Rasto