Friday 23 December 2016

Primary Storage, Snapshots, Databases, Backup, and Archival

Data in the enterprise comes in many forms: simple flat files, transactional databases, scratch files, complex binary blobs, encrypted files, whole block devices, and filesystem metadata. Simple flat files, such as documents, images, and application and operating system files, are by far the easiest to manage. These files can simply be scanned by access time, sorted, and managed for backup and archival. Some systems can even transparently symlink these files to other locations for archival purposes. In general, files in this category are opened and closed in rapid succession and rarely change. This makes them ideal for backup, since they can be copied exactly as they are; in the distant past, they were all that there was, and that was enough.
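
To make the access-time approach concrete, here is a minimal Python sketch that walks a directory tree and flags files that have not been read in a while as archival candidates. The path and the 180-day cutoff are placeholders, and the technique assumes the filesystem actually records access times (many volumes are mounted with noatime or relatime, which weakens it).

```python
import os
import time

# Illustrative sketch: walk a directory tree and flag files whose last
# access time is older than a cutoff, making them candidates for
# archival or a lower backup tier. The 180-day threshold and the root
# path below are placeholders, not recommendations.
CUTOFF_DAYS = 180
CUTOFF = time.time() - CUTOFF_DAYS * 86400

def archival_candidates(root):
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_atime < CUTOFF:
                yield path, st.st_size

if __name__ == "__main__":
    for path, size in archival_candidates("/srv/documents"):
        print(f"{size:>12}  {path}")
```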

Then came multitasking. With the introduction of multiple programs running in a virtual memory space, it became possible for a file to be opened by two different applications at once. It also became possible for these locked files to be opened and changed in memory without being synchronized back to disk. So elaborate systems were developed to handle file locks, along with buffers that flush their changes back to those files on a periodic or triggered basis. Databases in this space were always open and could not be backed up as they were. Instead, every transaction was logged to a separate set of files, called a transaction log, which could be played back to restore the database to a working state. This is still in use today, as reading the entire database may not be possible, or performant, on a production system. Mail servers, database management systems, and networked applications all had to develop programming interfaces to back their data up into a single stream of files, a format exemplified by the Tape Archive (tar).
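
To illustrate the "single stream of files" idea, here is a small Python sketch using the standard library's tarfile module to pack a directory into one archive and list it back. The paths are purely illustrative; a real mail or database backup would first quiesce or export the data through the application's own interface before packing anything.

```python
import tarfile

# Pack one directory into a single compressed tar stream, then read
# the archive back and list its members. Paths are placeholders.
with tarfile.open("backup-2016-12-23.tar.gz", "w:gz") as archive:
    archive.add("/var/mail/exports", arcname="mail-exports")

with tarfile.open("backup-2016-12-23.tar.gz", "r:gz") as archive:
    for member in archive.getmembers():
        print(member.name, member.size)
```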

Eventually, and actually quite recently, these systems became so large and complex that they required another layer of interface with the whole filesystem: certain application and operating system files were simply never closed long enough to copy. From this, the concept of copy-on-write was born. The on-disk state of the filesystem was essentially always in a closed, consistent form; any write produced an incremental or entirely new copy of the file, and the old one was marked for deletion. Filesystems in this modern era progressively implemented purer copy-on-write, transaction-based journaling, so that files could be assured intact after a system failure and could be read for archival or by multiple applications at once. Keep in mind this is a one-paragraph summation of 25 years of filesystem technology, and not specifically applicable to any single filesystem.
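
A tiny application-level analogue of the same idea (not a filesystem implementation, and not tied to any particular filesystem) is the write-new-then-rename pattern, sketched below in Python. The file name and contents are placeholders.

```python
import os
import tempfile

# Never modify the file in place: write a complete new copy and
# atomically swap it in, so readers always see either the old version
# or the new one, never a half-written file.
def atomic_rewrite(path, new_bytes):
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_bytes)
            tmp.flush()
            os.fsync(tmp.fileno())     # ensure the new copy is on disk
        os.replace(tmp_path, path)     # atomic swap; the old copy is dropped
    except BaseException:
        os.unlink(tmp_path)
        raise

atomic_rewrite("settings.conf", b"retention_days = 30\n")
```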


Along with journaling, which allowed a system to retain filesystem integrity, came the idea that the filesystem could intelligently retain old copies of files, and the state of the filesystem itself, as something called a snapshot. All of this stems from the microcosm of databases being applied to general filesystems. Databases still need to be backed up and accessed through controlled methods, but slowly the features of databases find their way into operating systems and filesystems. Modern filesystems use shadow copies and snapshotting to allow rollback of file changes, complete system restores, and undeletion of files, as long as the free space hasn't been reallocated.

Which brings us to my next point: the difference between a backup or archive and a snapshot. A snapshot is a picture of what a disk used to be. That picture is kept on the same disk, and in the event of a physical media failure, or of the disk itself filling up, it is entirely useless. There must be sufficient free space on the disk to hold the old snapshots, and if the disk fails, everything is still lost. While media redundancy can be managed well enough to virtually preclude failure, space considerations, especially in aged or unmanaged filesystems, can easily get out of hand. The practical effect of a filesystem growing near capacity is a loss of usable features: as time moves on, simple file rollback loses all effectiveness, and users have to go to the backup to find replacements.

There are products and systems that automatically compress and move files that are unlikely to be accessed in the near future. These systems usually create a separate filesystem and replace your files with links into it. This has the net effect of reducing the primary storage footprint and the backup load, and allowing your filesystem to grow effectively forever. In practice this is not as good as it sounds: the archive storage may still fill up, and you end up with an effective filesystem larger than its maximum theoretical size, which will have to be forcibly pruned before it can ever be restored properly. Also, if the archive system is not integrated with your backup system, the backup system will probably be unaware of it, which means the archived data would be lost in the event of a disaster or catastrophe.
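
In spirit, such a stub-and-link scheme looks something like the following hedged Python sketch, which relocates a file to an archive tier and leaves a symlink at the original path. The paths and the helper are hypothetical; real HSM-style products do this transparently and keep the backup catalog informed, which this toy version does not.

```python
import os
import shutil

# Hypothetical stub-and-link sketch: relocate a file onto cheaper
# archive storage and leave a symlink in its place so applications
# still see the original path.
def archive_with_stub(path, archive_root="/mnt/archive"):
    target = os.path.join(archive_root, os.path.relpath(path, "/"))
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.move(path, target)      # relocate the file to the archive tier
    os.symlink(target, path)       # leave a link where the file used to be
    return target
```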

Which brings up another point: whatever your backup vendor supports, you are effectively bound to those products for the life of the backup system. That may be ten or more years, and it can constrain business flexibility. Enterprise backup products can easily cost tens of thousands of dollars per year, and however flexible your systems need to be, your backup vendor must be able to keep pace.

Long-term planning and backup systems go hand in hand. Ideally, you should be aiming for a 7- to 12-year lifespan for these systems. They should be able to scale, in features and in load, along the predicted growth curve with a very wide margin for error. Conservatively, plan on a data growth rate of at least 25% per year; generally speaking, 50 to 100% is far more likely. Highly integrated backup systems truly are a requirement of Information Services, and while they are costly, failure to plan effectively for disaster or catastrophe will lead to an end of business continuity, and likely the continuity of your employment.
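
To see how quickly those growth rates compound over a planning horizon, here is a quick back-of-the-envelope Python calculation; the 100 TB starting point is purely illustrative.

```python
# Project storage needs over a 7- and 12-year horizon at the growth
# rates mentioned above, starting from an illustrative 100 TB.
start_tb = 100

for rate in (0.25, 0.50, 1.00):
    for years in (7, 12):
        projected = start_tb * (1 + rate) ** years
        print(f"{int(rate * 100):>3}% growth over {years:>2} years: "
              f"{projected:,.0f} TB")
```

At 25% per year, 100 TB becomes roughly 477 TB in 7 years and about 1.5 PB in 12; at 100% per year it reaches roughly 12.8 PB and 410 PB respectively, which is why the wide margin for error matters.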


Jason Zhang is the product marketing person for Rocket Software's Backup, Storage, and Cloud solutions.

Tuesday 13 December 2016

The Best of Both Worlds Regarding Mainframe Storage and the Cloud

It might shock you to hear that managing data has never been more difficult than it is today. Data volumes are growing at a staggering pace, while IT budgets are shrinking nearly as fast. All of this growth and change is forcing administrators to find better ways to manage and store data. This is no easy task, as there are many regulatory constraints around data retention, and the business value of the data needs to be considered as well. Those within the IT world likely remember (with fondness) hierarchical storage management (HSM) systems, which have traditionally played a key role in mainframe information lifecycle management (ILM). Though this was once a reliable and effective way to manage company data, gone are the days when businesses can put full confidence in such a method. The truth of the matter is that things have become much more complicated.

There is a growing need to collect information and data, and the bad news is that there is simply not enough money in the budget to handle the load. Not only are budgets straining under the growth, but current systems cannot keep up with the pace at which data, and its value, is increasing. It is estimated that global data center traffic will soon triple from its 2013 levels. You can imagine the strain this rapid growth puts on HSM and ILM. Administrators are left with loads of questions: how long must data be kept, what data must be stored, what data is safe to delete, and when is it safe to delete it? These questions are just the tip of the iceberg when it comes to data management. Regulatory requirements, estimated costs, and the issues of backup, recovery, and accessibility for critical data are concerns that must also be addressed amid this tremendous data growth.

There is an alluring solution on the scene that might make heads turn with respect to managing stored data. The idea of hybrid cloud storage is making IT administrators and businesses alike think there might actually be a way to manage this vast amount of data cost-effectively. So, what would this hybrid cloud look like? Essentially, it combines capabilities found in both private and public cloud storage, pairing on-site company data with storage capacity in the public cloud. Why would this be a good solution? Because companies are looking for a cost-effective way to manage the massive influx of data, and the hybrid cloud offers exactly that. The best part is that users only pay for the storage they actually use. The good news is that the options are seemingly unlimited, increasing or decreasing as client needs shift over time. With a virtualized architecture in place, the variety of storage options is endless. Imagine no longer worrying about the provider or the type of storage you are managing. With a hybrid cloud storage system in place, those worries go out the window. Think of it as commodity storage. Those within the business world understand that this type of storage has proven to work well within their spheres, ultimately offering limitless capacity to meet all of their data storage needs. What could be better?

In this fast-paced, shifting world, it is high time that solutions come to the forefront which can keep up with the growth and change so common in technology today. Keep in mind that the vast influx of data could become a huge problem if solutions such as the hybrid cloud are not considered. This combination of cloud storage is a great way to lower storage costs as retention time increases and data value decreases. With this solution, policies are respected, flexibility is gained, and costs are cut. When it comes to managing data effectively over time, hybrid cloud storage is a solution that almost anyone could get behind!

Jason Zhang is the product marketing person for Rocket Software's Backup, Storage, and Cloud solutions.



Cutting Edge IMS Database Management

Never before has database management been more difficult for those within the IT world. This should not come as a shock, especially when you consider how vast today's data volumes and streams are. The unfortunate news is that these volumes and streams are not shrinking anytime soon, but IT budgets ARE, and so are resources such as staff and technical skills. The question remains: how are businesses supposed to manage these databases in the most effective way? The very factors mentioned above make automation an extremely attractive choice.

Oftentimes, clients have very specific requirements when it comes to automating their IMS systems. Concerns arise such as how to make the most of CPU usage, what capabilities are available, what the strategic advantages are, and how to save on OpEx. These are not merely concerns but necessities, and factors that almost all clients weigh. Generally speaking, these requirements fall into two main categories: how to monitor database exceptions, and how to implement conditional reorgs.

Regarding the monitoring of database exceptions, clients must consider how long this process actually takes without automation. Without it, monitoring has to be done manually and requires complicated analysis by staff who are well experienced in this arena. With automation, policies monitor the databases, and an exception triggers an email notification that reports what attention is needed to resolve the problem.

Now, don't hear what we are NOT saying here. Does automation make life easier for everyone? Yes! Is implementing automation an easy and seamless process? No! In fact, automation requires some detailed work up front, so that it is clear what needs to be monitored and how that monitoring should be carried out. Ultimately, a monitor list is created that helps define and assign the various policies. This list makes clear WHO gets sent a notification in the event of an exception, what type of notification is sent, and what triggered the exception, and in the end it assigns a notification list to the policy. It may sound like a lot of work up front, but one could argue it is certainly worth it in the long run.
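
To make the monitor-list idea a little more concrete, here is a hedged Python sketch of what such a policy definition and notification flow might look like in spirit. The database names, thresholds, addresses, and SMTP host are all hypothetical, and nothing here reflects the configuration format of any actual IMS automation product.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical monitor list: which databases to watch, what counts as
# an exception, and who gets told about it.
MONITOR_LIST = [
    {"database": "ORDERDB", "max_fragmentation_pct": 30,
     "notify": ["dba-team@example.com"]},
    {"database": "CUSTDB", "max_fragmentation_pct": 40,
     "notify": ["dba-oncall@example.com"]},
]

def evaluate(stats):
    """stats maps database name -> measured fragmentation percentage."""
    for entry in MONITOR_LIST:
        measured = stats.get(entry["database"])
        if measured is not None and measured > entry["max_fragmentation_pct"]:
            notify(entry, measured)

def notify(entry, measured):
    # Build and send the exception notification for one policy breach.
    msg = EmailMessage()
    msg["Subject"] = f"IMS exception: {entry['database']}"
    msg["From"] = "ims-monitor@example.com"
    msg["To"] = ", ".join(entry["notify"])
    msg.set_content(
        f"{entry['database']} fragmentation at {measured}% "
        f"(threshold {entry['max_fragmentation_pct']}%). "
        f"Review and schedule a reorg if appropriate."
    )
    with smtplib.SMTP("mail.example.com") as smtp:
        smtp.send_message(msg)
```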

When it comes to conditional reorgs, automation saves the day once again. Many clients are quite scattered in their reorg cycle, some spreading reorgs across a spotty four-week rotation. The issue is that reorg jobs get scheduled into a particular window of time without anyone evaluating whether the time or resources are actually warranted. When clients automate a reorg, the automation determines whether the reorg is even necessary, and no manual work is needed. The scheduler still submits the reorg jobs, but they execute only if required. Talk about a good use of both time and resources! It ends up being a win-win situation.
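
In outline, the conditional decision is simple. Here is a minimal Python sketch of the idea, with placeholder metric names and thresholds and a hypothetical run_reorg() helper standing in for whatever utility actually performs the reorganization.

```python
# Hypothetical conditional-reorg check: the scheduler submits this job
# every cycle, but the actual reorg runs only when measured indicators
# cross a threshold.
THRESHOLDS = {"fragmentation_pct": 30, "overflow_pct": 10}

def reorg_needed(metrics):
    return any(metrics.get(name, 0) > limit
               for name, limit in THRESHOLDS.items())

def maybe_reorg(database, metrics, run_reorg):
    if reorg_needed(metrics):
        print(f"{database}: thresholds exceeded, running reorg")
        run_reorg(database)
    else:
        print(f"{database}: within thresholds, skipping reorg")

# Example: skip a healthy database, reorganize a fragmented one.
maybe_reorg("ORDERDB", {"fragmentation_pct": 12, "overflow_pct": 3},
            run_reorg=lambda db: None)
maybe_reorg("CUSTDB", {"fragmentation_pct": 45, "overflow_pct": 2},
            run_reorg=lambda db: None)
```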

Automation often ends up "saving the day" when it comes to meeting and exceeding client goals. Did you know that individual utilities, when teamed with the functionality and vast capabilities of the automation process, actually improve overall business performance and value? It is true. So, if you are looking for the most cutting-edge way to manage your IMS database, look no further than automation!


Jason Zhang is the product marketing person for Rocket Software's Backup, Storage, and Cloud solutions.