
Re: FDIO Maintenance - 2020-02-20 1900 UTC to 2400 UTC

Vanessa Valderrama
 

Maintenance Reminder

On 2/12/20 10:01 AM, Vanessa Valderrama wrote:
Maintenance has been moved to February 20th.

What: Standard updates and upgrade
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-20 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


On 1/7/20 8:30 AM, Vanessa Valderrama wrote:
Please let us know as soon as possible if this maintenance conflicts with your project.

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-05 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


FDIO Maintenance - 2020-02-20 1900 UTC to 2400 UTC

Vanessa Valderrama
 

Maintenance has been moved to February 20th.


What: Standard updates and upgrade
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-20 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


On 1/7/20 8:30 AM, Vanessa Valderrama wrote:
Please let us know as soon as possible if this maintenance conflicts with your project.

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-05 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


Re: FDIO Maintenance - 2020-02-25 1900 UTC to 2400 UTC

Vanessa Valderrama
 

We have moved the maintenance window to prevent interference with the 20.01 release.

What: Standard updates and upgrade
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-25 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


On 1/7/20 8:30 AM, Vanessa Valderrama wrote:
Please let us know as soon as possible if this maintenance conflicts with your project.

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-05 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


Re: [csit-dev] FDIO Maintenance - 2020-02-05 1900 UTC to 2400 UTC

Maciek Konstantynowicz (mkonstan)
 

Hi Vanessa,

Many thanks for the heads-up.

Unfortunately, 05-Feb is in the middle of the CSIT-2001 release, per the published schedule here: https://wiki.fd.io/view/CSIT/csit2001_plan

Is there any chance of postponing this upgrade until after 19-Feb? That is one week after our report publish target and leaves room in case we slip due to infra issues (we were affected by more SSD failures, as you may know).

Please let us know if this is possible.

Regards,
-Maciek

On 7 Jan 2020, at 14:30, Vanessa Valderrama <vvalderrama@...> wrote:





Please let us know as soon as possible if this maintenance conflicts with your project.

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-05 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#3866): https://lists.fd.io/g/csit-dev/message/3866
Mute This Topic: https://lists.fd.io/mt/69502829/675185
Group Owner: csit-dev+owner@...
Unsubscribe: https://lists.fd.io/g/csit-dev/unsub  [mkonstan@...]
-=-=-=-=-=-=-=-=-=-=-=-


FDIO Maintenance - 2020-02-05 1900 UTC to 2400 UTC

Vanessa Valderrama
 

Please let us know as soon as possible if this maintenance conflicts with your project.

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.204.1
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2020-02-05 1900 UTC to 2400 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


Re: FD.io Jenkins/OpenStack Outage

Vanessa Valderrama
 

A fix has been implemented. Services have been restored. If you
experience any further issues, please open a ticket at
support.linuxfoundation.org.

Thank you,

Vanessa

On 1/6/20 1:03 PM, Vanessa Valderrama wrote:
Our OpenStack cloud provider is having issues with its network controller,
which affects our CI OpenStack infrastructure across all projects.
We're working with the provider to fix the issue as quickly as possible.

Please feel free to check the status page for additional updates.

https://status.linuxfoundation.org/incidents/g22zdrl0vrfd

Thank you,
Vanessa



FD.io Jenkins/OpenStack Outage

Vanessa Valderrama
 

Our OpenStack cloud provider is having issues with its network controller,
which affects our CI OpenStack infrastructure across all projects.
We're working with the provider to fix the issue as quickly as possible.

Please feel free to check the status page for additional updates.

https://status.linuxfoundation.org/incidents/g22zdrl0vrfd

Thank you,
Vanessa


Re: FD.io Production Jenkins - Restart Required

Vanessa Valderrama
 

Jenkins is back up. Jobs have started. Ed and I are still trying to
resolve an issue with a few of the CSIT hourly jobs. Please open a
ticket at support.linuxfoundation.org if you have any issues.

Thank you,
Vanessa

On 12/16/19 2:10 PM, Vanessa Valderrama wrote:
We're also making some Nomad changes, which is causing a bit of a delay
in the restart. We should be done within about 30 minutes.

Thank you,

Vanessa


On 12/16/19 12:54 PM, Vanessa Valderrama wrote:
We've placed Jenkins in shutdown mode to allow for a restart. We need to
uninstall the OpenStack Single-Use Slave plugin. The plugin is no longer
required and we believe it's causing an issue that prevents the
OpenStack slaves from being removed in Jenkins.

We'll do the restart at 20:00 UTC.

The restart will take less than 5 minutes.

Thank you,
Vanessa


Re: FD.io Production Jenkins - Restart Required

Vanessa Valderrama
 

We're also making some Nomad changes, which is causing a bit of a delay
in the restart. We should be done within about 30 minutes.

Thank you,

Vanessa

On 12/16/19 12:54 PM, Vanessa Valderrama wrote:
We've placed Jenkins in shutdown mode to allow for a restart. We need to
uninstall the OpenStack Single-Use Slave plugin. The plugin is no longer
required and we believe it's causing an issue that prevents the
OpenStack slaves from being removed in Jenkins.

We'll do the restart at 20:00 UTC.

The restart will take less than 5 minutes.

Thank you,
Vanessa


FD.io Production Jenkins - Restart Required

Vanessa Valderrama
 

We've placed Jenkins in shutdown mode to allow for a restart. We need to
uninstall the OpenStack Single-Use Slave plugin. The plugin is no longer
required and we believe it's causing an issue that prevents the
OpenStack slaves from being removed in Jenkins.

We'll do the restart at 20:00 UTC.

The restart will take less than 5 minutes.

Thank you,
Vanessa
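
For reference, Jenkins' shutdown mode ("Prepare for Shutdown") can also be toggled over HTTP through the `quietDown` and `cancelQuietDown` endpoints. The Python sketch below only builds such a request and does not send it. The base URL, user name, and API token are placeholders, and this is an illustration under those assumptions, not the procedure LF followed for this restart.

```python
import base64
import urllib.request

def quiet_down_request(base_url, user, api_token, cancel=False):
    """Build (but do not send) a POST request that toggles Jenkins shutdown mode."""
    endpoint = "cancelQuietDown" if cancel else "quietDown"
    url = base_url.rstrip("/") + "/" + endpoint
    # Jenkins accepts HTTP Basic auth with a user name and an API token.
    credentials = base64.b64encode(f"{user}:{api_token}".encode()).decode()
    headers = {"Authorization": "Basic " + credentials}
    return urllib.request.Request(url, data=b"", headers=headers, method="POST")

if __name__ == "__main__":
    # Placeholder values; pass the request to urllib.request.urlopen() to send it.
    req = quiet_down_request("https://jenkins.example.org/", "admin", "token")
    print(req.get_method(), req.full_url)
```

Calling the function with `cancel=True` builds the request that takes Jenkins back out of shutdown mode.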


FD.io - Network Issues

Vanessa Valderrama
 

There appears to be a network routing issue with an upstream provider at our PDX data center. This affects some core infrastructure availability. Kernel.org and PDX mirrors are affected, as well as some CAF properties (wiki) and other services.

We have a workaround in place to restore service to OpenStack builders in the Vexxhost infrastructure.

FD.io systems affected

  • git.fd.io
  • FD.io lab machines

After a router reboot, the router is still showing down. Vexxhost is working on resolving the issue.

For updates please check
https://status.linuxfoundation.org

Thank you,
Vanessa


Re: FD.io Jenkins Maintenance: 2019-12-10 1900 UTC to 2200 UTC

Vanessa Valderrama
 

Maintenance is complete. All systems are available. Please open a ticket at support.linuxfoundation.org if you experience any issues.

Thank you,
Anton & Vanessa


On 12/10/19 1:04 PM, Vanessa Valderrama wrote:

Starting maintenance

On 12/10/19 7:15 AM, Vanessa Valderrama wrote:

Jenkins sandbox maintenance is complete. Jenkins production will be shut down at 1800 UTC in preparation for maintenance.

Thanks,
Vanessa


On 12/3/19 9:57 AM, Vanessa Valderrama wrote:

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.190.3
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2019-12-10 1900 UTC to 2200 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


Re: FD.io Jenkins Maintenance: 2019-12-10 1900 UTC to 2200 UTC

Vanessa Valderrama
 

Starting maintenance

On 12/10/19 7:15 AM, Vanessa Valderrama wrote:

Jenkins sandbox maintenance is complete. Jenkins production will be shut down at 1800 UTC in preparation for maintenance.

Thanks,
Vanessa


On 12/3/19 9:57 AM, Vanessa Valderrama wrote:

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.190.3
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2019-12-10 1900 UTC to 2200 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


Re: FD.io Jenkins Maintenance: 2019-12-10 1900 UTC to 2200 UTC

Vanessa Valderrama
 

Jenkins sandbox maintenance is complete. Jenkins production will be shut down at 1800 UTC in preparation for maintenance.

Thanks,
Vanessa


On 12/3/19 9:57 AM, Vanessa Valderrama wrote:

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.190.3
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2019-12-10 1900 UTC to 2200 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


Re: FD.io Jenkins Maintenance: 2019-12-10 1900 UTC to 2200 UTC

Vanessa Valderrama
 

Maintenance reminder

On 12/3/19 9:57 AM, Vanessa Valderrama wrote:

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.190.3
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2019-12-10 1900 UTC to 2200 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok


FD.io Jenkins Maintenance: 2019-12-10 1900 UTC to 2200 UTC

Vanessa Valderrama
 

What:
  • Jenkins
    • OS and security updates
    • Upgrade to 2.190.3
    • Plugin updates
  • Nexus
    • OS updates
  • Jira
    • OS updates
  • Gerrit
    • OS updates
  • Sonar
    • OS updates
  • OpenGrok
    • OS updates
When:  2019-12-10 1900 UTC to 2200 UTC

Impact:

Maintenance will require a reboot of each FD.io system. Jenkins will be placed in shutdown mode at 1800 UTC. Please let us know if specific jobs cannot be aborted.
The following systems will be unavailable during the maintenance window:
  •     Jenkins sandbox
  •     Jenkins production
  •     Nexus
  •     Jira
  •     Gerrit
  •     Sonar
  •     OpenGrok
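
The timing above (shutdown mode from 1800 UTC, maintenance from 1900 UTC to 2200 UTC) can be checked from a script before a long job is queued. The following is a minimal Python sketch; the function names and the phase labels are made up for illustration and are not part of any FD.io tooling. `2400`, as used in other announcements, is treated as midnight at the end of the given day.

```python
from datetime import datetime, timedelta, timezone

def at_utc(date_str, hhmm):
    """Combine '2019-12-10' and '1900' into an aware UTC datetime.

    '2400' is accepted and maps to 00:00 of the following day.
    """
    day = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return day + timedelta(hours=int(hhmm[:2]), minutes=int(hhmm[2:]))

def jenkins_phase(now, date_str="2019-12-10", shutdown="1800", start="1900", end="2200"):
    """Return 'normal', 'shutdown-mode', or 'maintenance' for an aware datetime."""
    if now < at_utc(date_str, shutdown) or now >= at_utc(date_str, end):
        return "normal"
    if now < at_utc(date_str, start):
        return "shutdown-mode"
    return "maintenance"

if __name__ == "__main__":
    print(jenkins_phase(datetime.now(timezone.utc)))
```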


Re: FD.io Gerrit Restart Required

Vanessa Valderrama
 

Gerrit restart is complete. Jenkins has been taken out of shutdown mode.

Thank you,
Vanessa


On 11/6/19 2:30 PM, Vanessa Valderrama wrote:

What:

LF is setting up a local Gerrit mirror to resolve the Gerrit cloning timeout errors that have been causing intermittent job failures. We'll need to restart Gerrit for the replication settings to take effect.

Impact:

  • Jenkins will be placed in shutdown mode to allow verify and merge jobs to complete before the Gerrit restart
    • Jenkins will NOT be restarted and NO jobs will need to be terminated
  • Gerrit will be unavailable during the restart, approximately 1 minute
Thank you,
Vanessa


FD.io Gerrit Restart Required

Vanessa Valderrama
 

What:

LF is setting up a local Gerrit mirror to resolve the Gerrit cloning timeout errors that have been causing intermittent job failures. We'll need to restart Gerrit for the replication settings to take effect.

Impact:

  • Jenkins will be placed in shutdown mode to allow verify and merge jobs to complete before the Gerrit restart
    • Jenkins will NOT be restarted and NO jobs will need to be terminated
  • Gerrit will be unavailable during the restart, approximately 1 minute
Thank you,
Vanessa

---------------------------------------------------------------------------------------------------------------------------------------------

Status Update

Issue: Gateway Timeout Errors

  • Summary: Intermittent Gateway Timeout Errors on the ci-management-jjb-merge jobs are causing stability issues with Jenkins, resulting in unplanned downtime
    • We have put in a change to take Nginx out of the picture and allow the build node to talk directly to Jenkins
    • We'll be monitoring closely to ensure this resolves the issue

Issue: Gerrit cloning timeouts

  • Summary: Intermittent job failures caused by a timeout when cloning a Gerrit repo
    • We have opened a Vexxhost ticket for Vexxhost and Ed Kern to troubleshoot the latency within the network the Nomad cluster is on
    • We are also setting up a local Gerrit mirror which should help resolve/improve cloning - this should be complete by the end of the week

Issue: CSIT: s3-t21-sut1 (10.30.51.44) failure

  • Summary: The s3-t21-sut1 device has an SSH read-only disk issue and is unreachable over the network
    • We've opened a Vexxhost ticket to check the machine

Issue: Hung jobs

  • Summary: Intermittent jobs stuck/hung requiring the job to be aborted
    • We believe this issue was resolved with the latest Jenkins upgrade

Please let me know if you need additional information. If you experience any hung jobs or gateway timeout errors, please open a ticket at support.linuxfoundation.org.
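
Until the local mirror is in place, jobs that hit the intermittent clone timeouts could retry the clone with a growing delay. The Python sketch below illustrates that idea only; it is not part of the fix LF is rolling out, and the function names, retry counts, delays, and repository URL are all hypothetical.

```python
import subprocess
import time

def retry(action, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call action() until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return action()
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)

def clone(url, dest, timeout=300):
    """Run `git clone`, raising on a non-zero exit or after `timeout` seconds."""
    subprocess.run(["git", "clone", url, dest], check=True, timeout=timeout)

if __name__ == "__main__":
    # Placeholder repository URL and destination directory.
    retry(lambda: clone("https://gerrit.example.org/r/project", "project"))
```

Retrying only masks transient failures; a clone that keeps timing out still fails after the last attempt and should be reported.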


Re: [tsc] FD.io Production Jenkins Restart Required

Vanessa Valderrama
 

Status Update

Issue: Gateway Timeout Errors

  • Summary: Intermittent Gateway Timeout Errors on the ci-management-jjb-merge jobs are causing stability issues with Jenkins, resulting in unplanned downtime
    • We have put in a change to take Nginx out of the picture and allow the build node to talk directly to Jenkins
    • We'll be monitoring closely to ensure this resolves the issue

Issue: Gerrit cloning timeouts

  • Summary: Intermittent job failures caused by a timeout when cloning a Gerrit repo
    • We have opened a Vexxhost ticket for Vexxhost and Ed Kern to troubleshoot the latency within the network the Nomad cluster is on
    • We are also setting up a local Gerrit mirror which should help resolve/improve cloning - this should be complete by the end of the week

Issue: CSIT: s3-t21-sut1 (10.30.51.44) failure

  • Summary: The s3-t21-sut1 device has an SSH read-only disk issue and is unreachable over the network
    • We've opened a Vexxhost ticket to check the machine

Issue: Hung jobs

  • Summary: Intermittent jobs stuck/hung requiring the job to be aborted
    • We believe this issue was resolved with the latest Jenkins upgrade

Please let me know if you need additional information. If you experience any hung jobs or gateway timeout errors, please open a ticket at support.linuxfoundation.org.

Thank you,
Vanessa

On 11/6/19 9:56 AM, Maciek Konstantynowicz (mkonstan) wrote:
Hi Vanessa, thanks for the note. The CSIT project keeps experiencing issues
due to Jenkins outages. Do you have an ETA for the fix that will stop these
outages?

-Maciek

On 5 Nov 2019, at 23:18, Vanessa Valderrama <vvalderrama@...> wrote:

Jenkins has been restarted, job views restored, jobs are running.

We will continue to investigate the Gateway Timeout and JNLP errors
we've been seeing the last couple of days.

If you experience any issues, please open a ticket at
support.linuxfoundation.org

Thank you,
Vanessa


On 11/5/19 4:39 PM, Vanessa Valderrama wrote:
We continue to have issues with Gateway Timeouts on the CI merge job, which have
corrupted the Jenkins job views.

Jenkins will need to be restarted to resolve this issue.

Thank you,
Vanessa



Re: [tsc] FD.io Production Jenkins Restart Required

Maciek Konstantynowicz (mkonstan)
 

Hi Vanessa, thanks for the note. The CSIT project keeps experiencing issues
due to Jenkins outages. Do you have an ETA for the fix that will stop these
outages?

-Maciek

On 5 Nov 2019, at 23:18, Vanessa Valderrama <vvalderrama@...> wrote:

Jenkins has been restarted, job views restored, jobs are running.

We will continue to investigate the Gateway Timeout and JNLP errors
we've been seeing the last couple of days.

If you experience any issues, please open a ticket at
support.linuxfoundation.org

Thank you,
Vanessa


On 11/5/19 4:39 PM, Vanessa Valderrama wrote:
We continue to have issues with Gateway Timeouts on the CI merge job, which have
corrupted the Jenkins job views.

Jenkins will need to be restarted to resolve this issue.

Thank you,
Vanessa