IT Services - Incident Management Simulation

SERVICE STATUS
Banner / Minerva
Email & Evault
myCourses
Content Mgt System
Network Connectivity
Telephony
uApply

 

ANNOUNCEMENTS

No problems to report at present.

 

SYSTEM STATUS RECORDING (3699/1)

All systems are up and operating normally.

 

View IT System Admins Twitter Feed

Not sure how to participate (role-play) in the simulation? Please see the RULES below.

Still puzzled? Please contact the Gamemaster.

Today’s Gamemaster

Simon Fulleringer, local 3609

simon.fulleringer@mcgill.ca

RULES OF THE SIMULATION:

Reference documents in CenterStage:

DOs

| Things you CAN do for real | Caveat |
| --- | --- |
| Contact the Service Desk, NCS Operations, or any other person or unit in IT Services in person, by phone, or by instant message | Introduce the conversation by making it clear that this is part of a test (incident simulation). |
| Contact any person or unit in IT Services by email | Clearly marked **TEST**SIMULATION** |
| Hold ad hoc meetings (IT Services staff only) to discuss the incident and decide on actions | Ensure that everyone invited knows that this is a simulation exercise. |
| Create, categorize, and process HEAT tickets | Clearly marked **TEST**SIMULATION** |
| Issue SSAs | Clearly marked **TEST**SIMULATION**. Do NOT send notifications outside IT Services. |
| Issue major incident notifications | Clearly marked **TEST**SIMULATION** |
| Issue RFCs | Clearly marked **TEST**SIMULATION** |
| Test a service directly (hands-on) | The results should show that all is normal, because in reality there is presumably no problem. To advance the simulation, call the Gamemaster, who will tell you what you “should” be seeing in the context of the simulated incident. |
| Check the status of a service according to the /it page, channel announcements, system status line, or other published information | See the grid “Current (simulated) public status information” at the top of this page, or call the Gamemaster. |
| Log in to servers and databases; examine their state; inspect files and query data; use monitoring tools | The results should show that all is normal, because in reality there is presumably no problem. To advance the simulation, call the Gamemaster, who will tell you what you “should” be seeing in the context of the simulated incident. |

CONFUSED/STUCK/FRUSTRATED?

Call the Gamemaster. He can give hints on understanding the simulation and on following the incident management process. Ideally he won’t need to help you actually diagnose or resolve the issue, but he may do so if the simulation exercise appears to be at risk of stalling.

DON’TS

| Things you MUST NOT DO for real | Instead you should |
| --- | --- |
| Update any public-facing information such as the /it webpage (traffic lights), WMS channels, system status line | Call the Gamemaster and inform him of what you would have done if this were a real incident. |
| Contact any person outside IT Services | Call the Gamemaster and inform him of what you would have done if this were a real incident. If appropriate, the Gamemaster will provide you with a simulated response. |
| Take any action which would affect live services, e.g. restart a service or database, reboot a server, change configuration information, modify data | Call the Gamemaster and inform him of what you would have done in the case of a real incident. The Gamemaster will inform you of the immediate outcome, e.g. that the service or database restarted successfully (or not). He will NOT automatically tell you anything about the outcome as regards restoration of services. |

This page: https://webforms.mcgill.ca/it-simulation/

Simulation date: May 1, 2014

This page updated: March 20, 3:20 PM