ConicIT Success Stories
The following are a few examples of performance problems that ConicIT detected in time. Thanks to ConicIT's early detection of performance anomalies, those problems did not degrade the service for users.
- Several customers reported to us that it was only thanks to ConicIT that they recognized situations in which jobs entered a loop (100% CPU) and caused a huge load on their system. These cases are very hard to find without ConicIT, because monitors such as TMON, Omegamon, Mainview, and others show only the total CPU time since the job started (often since IPL). That makes it hard for these monitors, and for performance staff, to see the real-time CPU usage at any given moment. ConicIT, in contrast, calculates the real-time CPU from each job's delta from minute to minute (the amount of CPU time added during each minute). ConicIT thus tracks the real-time CPU, stores its history, and builds statistics on each job's typical real-time CPU consumption for each day and hour. ConicIT detected many situations of jobs with abnormally high CPU. Some of these cases were extreme and, without ConicIT, could have crashed the mainframe.
- Memory condition of CICS: We know of at least one case, at a bank, in which ConicIT detected rapidly rising CICS memory consumption caused by a memory leak. ConicIT detected it at an early stage (around 30-40% memory consumption, which was already abnormal for that CICS). Thanks to ConicIT's early alert, the bank managed to fix the problem before the CICS crashed (consumption was already around 90% when they fixed it).
- In a few cases that we know of, ConicIT helped to identify resource overload. For example, when the CICS transaction rate was too low because of a DB2 problem, ConicIT sent an alert about the too-low transaction rate and also showed that many transactions were waiting for the same resource. This is possible because ConicIT not only follows and learns the mainframe's behavior, but also creates meaningful aggregations and calculations (in this case, the number of transactions waiting for each resource).
- In many situations ConicIT sent alerts about DB2 errors and DB2 locks, which helped to identify DB2 problems in time and fix them.
- In one case, thanks to ConicIT, a bank recognized a problem in its procedure for sending data to a third party. The bank periodically transfers huge files to several insurance companies. The bank's IT staff had never noticed it, but this transfer caused huge CPU consumption whenever it ran. Thanks to ConicIT they found it, investigated it, and quickly discovered that they were applying an excessively high compression level before sending these files, which is very expensive in CPU. They reduced the compression level and the problem was solved.
- At a national branch of Mastercard, it was only thanks to a ConicIT alert that they recognized that their internet connection was too slow. In response they contacted their internet service provider and found that one of the provider's two main servers was down and nobody knew about it. So, thanks to ConicIT, Mastercard was the first to recognize the problem at that large internet service provider.
- ConicIT detected many situations of long TClass queues. They were detected at an early stage, which prevented the bigger problems that could have developed from them.
- Detection of DB2 problems based on specific DB2 metrics (for example, the number of timeouts and the number of failed reads/writes).
- Many situations in which companies discovered problems that they had not been aware of and that consumed a lot of CPU resources, including situations in which they missed their SLAs by wide margins without knowing it.
- ConicIT detected, at an early stage, many situations in which bugs led to excessive CPU consumption by CICSplexes.
- ConicIT immediately detected situations in which a specific important transaction had a sudden drop in transaction rate (sometimes even to zero). Similar cases were detected for too-high transaction rates and transaction CPU consumption.
- Detection of CICS abends.
- Situations in which ConicIT detected jobs that required too much paging (of virtual memory).
- Situations of too-high LPAR LCPD and PCPD CPU%.
- A few situations of malfunctioning CICS regions, which were found thanks to alerts from ConicIT. The alerts reported a large difference in the number of terminals attached to different CICS regions that are responsible for the same tasks and are supposed to have a similar number of terminals attached.
- Situations in which MQ queues were disabled, and situations in which too many threads were connected to the same queue manager.
- Situations in which tasks had not finished even after a few minutes and consumed high CPU for a long time.
- Situations with too-long response times for specific important transactions, and situations with too-long response times for specific transactions across all of the dynamic CICS regions.
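
The minute-to-minute delta calculation described in the first example (looping jobs) can be illustrated with a short sketch. This is not ConicIT's actual implementation: the function names, the sample format, and the "three times the typical value" threshold are all assumptions made purely for illustration. It only shows how a cumulative CPU-time counter, as reported by a typical monitor, might be turned into per-minute CPU usage and compared with a job's history.

```python
from statistics import mean


def per_minute_deltas(samples):
    """Convert cumulative CPU-time samples into per-minute CPU usage.

    `samples` is a list of (minute, cumulative_cpu_seconds) pairs taken
    one minute apart and sorted by minute (a hypothetical format).
    """
    deltas = []
    for (_, prev_cpu), (minute, cpu) in zip(samples, samples[1:]):
        # A counter that goes backwards suggests the job restarted,
        # so that interval is skipped rather than reported as negative.
        if cpu >= prev_cpu:
            deltas.append((minute, cpu - prev_cpu))
    return deltas


def is_abnormal(delta, history, factor=3.0):
    """Flag a delta that exceeds `factor` times the historical mean.

    `history` holds earlier deltas for the same job (for example, for the
    same day and hour). The factor of 3.0 is an arbitrary assumption.
    """
    if not history:
        return False  # nothing to compare against yet
    return delta > factor * mean(history)


# Cumulative CPU seconds for one job, sampled once per minute.
samples = [(0, 100.0), (1, 102.0), (2, 104.0), (3, 164.0)]
deltas = per_minute_deltas(samples)

# Compare the latest minute with the earlier minutes.
history = [d for _, d in deltas[:-1]]
latest_minute, latest_delta = deltas[-1]
print(latest_minute, latest_delta, is_abnormal(latest_delta, history))
```

In this made-up data the cumulative counter alone (164 seconds) says little, while the delta for the last minute (60 CPU seconds within one minute, against a typical 2) points to a job that is probably looping.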