OUR SUCCESS STORIES
The following are a few examples of performance problems that ConicIT found in time. Thanks to ConicIT’s early detection of performance anomalies, these problems did not degrade the service for users.
Recognizing 100% CPU
Several customers reported to us that it was only thanks to ConicIT that they recognized situations in which jobs entered a loop (100% CPU) and put a huge load on their system. These cases are very hard to find without ConicIT, because monitors such as TMON, Omegamon, Mainview, and others only show the total CPU time since the job started (often that means since IPL). That makes it hard for these monitors, and for performance staff, to see the real-time CPU at any given moment. ConicIT, however, calculates the real-time CPU from the delta of each job from minute to minute (the amount of CPU time the job added during each minute). ConicIT thus recognizes the real-time CPU, stores its history, and builds statistics on the typical real-time CPU consumption of each job for each day and hour. ConicIT recognized many situations in which jobs used too much CPU. Some of these situations were extreme and, without ConicIT, could have crashed the mainframe.
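The delta calculation described above can be sketched as follows. This is only an illustration, not ConicIT’s actual implementation: the function name, the one-sample-per-minute input, and the choice to clamp a shrinking counter to zero are all assumptions, and the percentage is relative to a single CPU.

```python
from typing import List


def cpu_percent_per_interval(cumulative_cpu_seconds: List[float],
                             interval_seconds: float = 60.0) -> List[float]:
    """Turn cumulative CPU-time samples into per-interval CPU percentages.

    Each sample is the total CPU time (in seconds) a job has used since it
    started, taken once per interval. The result has one entry per pair of
    consecutive samples.
    """
    percents = []
    for previous, current in zip(cumulative_cpu_seconds, cumulative_cpu_seconds[1:]):
        # Assumption: a counter that goes backwards (for example after a job
        # restart) is treated as "no CPU used" for that interval.
        delta = max(current - previous, 0.0)
        percents.append(100.0 * delta / interval_seconds)
    return percents


# A job that has run for a long time and then starts looping:
samples = [3600.0, 3603.0, 3606.0, 3666.0, 3726.0]
print(cpu_percent_per_interval(samples))  # [5.0, 5.0, 100.0, 100.0]
```

The cumulative total alone (3726 seconds) says little, while the per-minute deltas show the jump from 5% to 100% directly.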
Identifying DB2 problems
There were many situations in which ConicIT sent alerts about DB2 errors and DB2 locks, which helped to identify DB2 problems in time and fix them.
Identifying a high compression level
In one case, ConicIT helped a bank recognize a problem in its procedure for sending data to a third party. The bank periodically transfers huge files to several insurance companies. The bank’s IT staff had never noticed it, but this transfer caused very high CPU consumption whenever it ran. Thanks to ConicIT they found it, investigated it, and quickly discovered that they were applying too high a compression level before sending the files, which is very expensive in CPU. They reduced the compression level and the problem was solved.
Recognizing long TClass queues
ConicIT recognized many situations of long TClass queues. It recognized them at an early stage, which prevented bigger problems from developing.
Identifying unnoticed problems
In many situations, companies recognized problems that they had not been aware of and that consumed a lot of CPU resources. These included situations in which they missed their SLA by a wide margin and did not know about it.
Recognizing sudden drops
ConicIT immediately recognized situations in which a specific important transaction had a sudden drop in transaction rate (sometimes even to 0). Similar cases were recognized for transaction rates and transaction CPU that were too high.
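One simple way to picture this kind of check is to compare the current rate with the typical rate seen in the past. The sketch below is an assumption for illustration only: the function name, the use of a plain mean as the "typical" value, and the 0.5 and 2.0 thresholds are hypothetical and do not describe how ConicIT models normal behavior.

```python
from statistics import mean
from typing import Sequence


def classify_rate(history: Sequence[float], current_rate: float,
                  low_ratio: float = 0.5, high_ratio: float = 2.0) -> str:
    """Classify the current transaction rate against its historical mean.

    Returns "drop" when the rate is below low_ratio times the mean,
    "spike" when it is above high_ratio times the mean, else "normal".
    """
    if not history:
        return "normal"  # nothing to compare against yet
    typical = mean(history)
    if current_rate < low_ratio * typical:
        return "drop"
    if current_rate > high_ratio * typical:
        return "spike"
    return "normal"


history = [100, 110, 90]            # transactions per minute seen earlier
print(classify_rate(history, 0))    # drop
print(classify_rate(history, 95))   # normal
print(classify_rate(history, 250))  # spike
```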
Recognizing high paging
Situations in which ConicIT recognized jobs that needed too much paging (of virtual memory).
Recognizing CICS malfunctions
A few situations of malfunctioning CICS regions were found thanks to alerts from ConicIT. The alerts reported a big difference in the number of terminals attached to different CICSes that are responsible for the same tasks and are supposed to have a similar number of terminals attached.
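A comparison between peer CICSes like the one described above could look roughly like this. It is a minimal sketch under stated assumptions: the region names, the use of the median as the reference value, and the 50% tolerance are hypothetical choices, not ConicIT’s actual rule.

```python
from statistics import median
from typing import Dict


def terminal_outliers(terminals_by_region: Dict[str, int],
                      tolerance: float = 0.5) -> Dict[str, int]:
    """Return the regions whose terminal count is far from the group median.

    A region is flagged when its count differs from the median by more than
    tolerance times the median.
    """
    if not terminals_by_region:
        return {}
    typical = median(terminals_by_region.values())
    return {region: count
            for region, count in terminals_by_region.items()
            if abs(count - typical) > tolerance * typical}


# Four regions that serve the same tasks (names are made up):
counts = {"CICSA": 410, "CICSB": 395, "CICSC": 40, "CICSD": 402}
print(terminal_outliers(counts))  # {'CICSC': 40}
```

The median is used here instead of the mean so that a single broken region does not shift the reference value much.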
Memory conditions of CICS
We know of at least one case in which ConicIT recognized rapidly rising memory consumption in a CICS at a bank, caused by a memory leak. ConicIT recognized it at an early stage (around 30-40% memory consumption, which was already abnormal for that CICS). Thanks to ConicIT’s early alert, the bank managed to fix the problem before the CICS crashed (consumption was already around 90% when they fixed it).
Identifying resource overload
In a few situations that we know of, ConicIT helped to identify resource overload. For example, when the CICS transaction rate was too low because of a DB2 problem, ConicIT sent an alert about the low transaction rate and also showed that many transactions were waiting for the same resource. This is possible because ConicIT not only follows and studies the mainframe’s behavior, but also creates meaningful aggregations and calculations (in this case, an aggregation of the number of transactions waiting for each resource).
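The aggregation mentioned above, counting how many transactions wait for each resource, can be sketched like this. The input shape, the function names, the resource names, and the `min_waiters` threshold are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def waiting_by_resource(waiting: Iterable[Tuple[str, str]]) -> Counter:
    """Count waiting transactions per resource.

    Each item is a (transaction_id, resource_name) pair.
    """
    return Counter(resource for _, resource in waiting)


def hot_resources(waiting: Iterable[Tuple[str, str]],
                  min_waiters: int = 3) -> List[Tuple[str, int]]:
    """Return (resource, waiters) pairs with at least min_waiters waiting,
    ordered from most to fewest waiters."""
    counts = waiting_by_resource(waiting)
    return [(res, n) for res, n in counts.most_common() if n >= min_waiters]


# Made-up snapshot of waiting transactions:
snapshot = [
    ("T001", "DB2.TABLE_A"), ("T002", "DB2.TABLE_A"),
    ("T003", "FILE_X"),      ("T004", "DB2.TABLE_A"),
    ("T005", "DB2.TABLE_A"), ("T006", "FILE_X"),
]
print(hot_resources(snapshot))  # [('DB2.TABLE_A', 4)]
```

Seen one by one, each waiting transaction looks unremarkable. The per-resource count is what points to the shared bottleneck.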
Recognizing that a server was down
At a national branch of Mastercard, it was only thanks to a ConicIT alert that they recognized that their internet connection was too slow. In response, they spoke with their internet service provider and found that one of the provider’s two main servers was down and nobody knew about it. So Mastercard was the first to recognize the problem at that large internet service provider, thanks to ConicIT.
Identifying DB2 problems, part 2
Recognition of DB2 problems based on specific DB2 metrics (for example, the number of timeouts and the number of failed reads and writes).
Recognizing too-high LPAR CPU
Situations in which the LPAR LCPD and PCPD CPU% were too high.
Recognizing poor queue management
Situations in which MQ queues were disabled, and situations in which too many threads were connected to the same queue manager.
Recognizing long response times
Situations in which response times were too long for specific important transactions, and situations in which response times were too long for specific transactions across all of the dynamic CICSes.