Recording accidents and near misses, whether major or minor, has long been a practice in industry as a way to mitigate and learn from events that occur in the workplace.
A well-run IT system should be no different. In a previous tip we discussed service outage analysis, which covers the major incidents (or accidents) that result in a period of downtime. To record the minor incidents and near misses, you should keep a risk register. At a minimum it should store the date the risk was first spotted, its status, its owner, the service it relates to, the risk itself, and the potential impact to service.
This information can form part of service improvement, provide backing for any project work that requires expenditure, and ensure you have done your job correctly by highlighting, at the correct level, the potential failings in the system should the worst happen.
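As a rough illustration of the register fields described above, the sketch below models one entry as a small Python record. The class name `RiskEntry`, the helper `open_risks_for`, the status values "open" and "closed", and the sample entries are all hypothetical; a spreadsheet or a ticketing tool would serve just as well.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RiskEntry:
    """One row of the risk register (field names are illustrative)."""
    first_spotted: date    # date the risk was first spotted
    status: str            # assumed values: "open" or "closed"
    owner: str             # person or team responsible for the risk
    service: str           # service the risk relates to
    risk: str              # short description of the risk
    potential_impact: str  # what could happen to the service


def open_risks_for(register, service):
    """Return the open entries recorded against one service."""
    return [e for e in register if e.service == service and e.status == "open"]


# Made-up sample entries, for illustration only.
register = [
    RiskEntry(date(2024, 1, 15), "open", "ops-team", "mail",
              "Backup disk nearly full", "Nightly backups may fail"),
    RiskEntry(date(2024, 2, 3), "closed", "ops-team", "mail",
              "Expiring TLS certificate", "Clients unable to connect"),
    RiskEntry(date(2024, 2, 20), "open", "db-team", "database",
              "Single power supply in server", "Outage on hardware failure"),
]

for entry in open_risks_for(register, "mail"):
    print(entry.first_spotted, entry.owner, entry.risk)
```

Filtering by service and status like this is one way to pull out the open risks when building the case for project work on a particular service.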