Invented by Anand Vibhor, Bhavyan Bharatkumar Mehta, Amey Vijaykumar Karandikar, Parag Gokhale, Commvault Systems Inc
The Commvault Systems Inc invention works as follows
In general, a data management system is disclosed that allows files (and/or data) to be synchronized between two or more computing devices. Synchronization can also include backing up the files. Synchronization policies specify which files are to be synchronized according to selected criteria, including file metadata and location information. In general, files are first copied from a client computing device into secondary storage. The files that need to be synchronized are then selected from secondary storage and copied to another client computing device. Synchronized files can also be accessed and viewed through a cloud or remote file access interface.
Background for Data synchronization management
Global businesses recognize the commercial value of their data and seek cost-effective, reliable ways to protect their information while minimizing the impact on productivity. Information protection is often part of a routine organizational process.
A company may back up important computing systems, such as web servers and file servers, as part of its daily, weekly, or monthly maintenance plan. A company might also protect the computing systems used by each of its employees, such as those in an accounting, marketing, or engineering department.
Companies continue to look for innovative ways to manage data growth and protect data, given the ever-growing volume of data under their control. Companies often use migration techniques to move data to cheaper storage and data reduction techniques to reduce redundant data, prune lower priority data, and so forth.
Enterprises are also increasingly viewing their stored data as an asset. Customers are also looking for solutions to not only manage and protect their data, but also make the most of it. Solutions that provide data analysis, better data presentation, access and features, data synchronization and other similar capabilities are increasingly in demand.
In general, aspects of the disclosure relate to a data synchronization management system that can synchronize files (and/or data) between two or more computing devices and back up those files. In one embodiment, a user of the system can specify one or more file synchronization policies. In a file synchronization policy, the user can specify which files to synchronize as well as the devices and locations to which they are synchronized. For example, a user can create a file synchronization policy specifying that all files within a certain directory on their laptop are to be synchronized, and that the files should be synchronized to their desktop. Optionally, the user can specify a schedule for synchronization and/or whether the synchronization is one-way or two-way. Data backup policies can be used in conjunction with the data synchronization system.
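To make the shape of such a policy concrete, here is a minimal sketch of how the laptop-to-desktop example might be recorded. The class name `SyncPolicy`, its field names, and the prefix-based directory check are illustrative assumptions; the disclosure does not prescribe a data format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SyncPolicy:
    """Hypothetical record of one user-defined file synchronization policy."""
    source_device: str               # device whose files are backed up, e.g. a laptop
    source_directory: str            # directory whose files the policy covers
    destination_device: str          # device that receives the files, e.g. a desktop
    two_way: bool = False            # False means one-way sync, source to destination
    schedule: Optional[str] = None   # optional schedule label; None means none was set

    def covers(self, device: str, path: str) -> bool:
        """Return True if a file at `path` on `device` falls under this policy."""
        prefix = self.source_directory.rstrip("/") + "/"
        return device == self.source_device and path.startswith(prefix)


# Assumed example values: sync everything under one laptop directory to a desktop.
policy = SyncPolicy("laptop", "/home/user/projects", "desktop", schedule="daily")
```

The trailing slash added in `covers` keeps a sibling directory such as `/home/user/projects2` from being matched by accident.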
In one embodiment, file synchronization is performed as follows. Specific files are copied, or backed up, from a source client computing device into secondary storage. Files to be synchronized are identified based on the synchronization policies and on whether they have been modified since the previous backup. If a backed-up file that is subject to synchronization is unchanged compared to its previous backup, it is not synchronized. Any files identified for synchronization are then synchronized to the destination client computing device specified in the user's synchronization policies.
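The flow described here can be sketched in a few lines. This is only an illustration under stated assumptions: storage locations are modeled as in-memory dictionaries, the policy is reduced to a single directory, and change detection uses a SHA-256 comparison, a technique chosen for the sketch and not one named in the disclosure. The function name `backup_and_sync` is hypothetical.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Content fingerprint used here to detect changes (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def backup_and_sync(source_files: Dict[str, bytes],
                    secondary: Dict[str, bytes],
                    destination: Dict[str, bytes],
                    sync_dir: str) -> List[str]:
    """Back up every source file, then sync the changed ones under sync_dir."""
    prefix = sync_dir.rstrip("/") + "/"
    synced = []
    for path, data in source_files.items():
        previous = secondary.get(path)
        changed = previous is None or digest(previous) != digest(data)
        secondary[path] = data                    # step 1: back up to secondary storage
        if changed and path.startswith(prefix):   # policy match and modified since last backup
            destination[path] = secondary[path]   # step 2: copy from secondary storage
            synced.append(path)
    return synced


secondary = {"/docs/a.txt": b"old", "/docs/b.txt": b"same"}
destination: Dict[str, bytes] = {}
source = {"/docs/a.txt": b"new", "/docs/b.txt": b"same", "/tmp/c.txt": b"x"}
result = backup_and_sync(source, secondary, destination, "/docs")
```

Note that the destination is filled from the secondary copy, not directly from the source device, which mirrors the two-step order in the text.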
In one embodiment, the user can specify which files are to be synchronized. The files may be selected directly, or criteria can be specified that identify them indirectly. Files can be identified, for example, by detecting specified content in the file (such as the presence of one or more specified terms) and/or by matching metadata associated with the file to one or more specified parameters (such as filename, file owner, directory, creation date, modification date, size, file type, location, or Global Positioning System (GPS) coordinates, among others). The criteria for identifying files to be synchronized can vary. For example, files to be synchronized can be identified as those that are in a particular directory and were created after a specific date.
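As an illustration of indirect selection, the sketch below combines a directory test, a creation-date test, and a content-term test. The names `FileRecord` and `Criteria` are hypothetical, and treating every listed term as required is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class FileRecord:
    path: str
    created: date
    content: str


@dataclass
class Criteria:
    """Hypothetical selection criteria; unset fields are ignored."""
    directory: Optional[str] = None
    created_after: Optional[date] = None
    required_terms: Tuple[str, ...] = ()

    def matches(self, f: FileRecord) -> bool:
        if self.directory is not None:
            if not f.path.startswith(self.directory.rstrip("/") + "/"):
                return False
        if self.created_after is not None and not f.created > self.created_after:
            return False
        text = f.content.lower()
        # Assumption: all listed terms must appear (case-insensitive substring test).
        return all(term.lower() in text for term in self.required_terms)


files = [
    FileRecord("/reports/q1.txt", date(2020, 3, 1), "Budget summary"),
    FileRecord("/reports/old.txt", date(2019, 5, 1), "Budget summary"),
    FileRecord("/misc/q1.txt", date(2021, 1, 1), "Meeting notes"),
]
criteria = Criteria(directory="/reports", created_after=date(2020, 1, 1))
selected = [f.path for f in files if criteria.matches(f)]
```

Here only the file that is both in `/reports` and created after the cutoff date is selected.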
In one embodiment, the data synchronization management system may add location information to files that are being backed up. A file to be synchronized can then be identified based on the geographical location where it was created or modified. Location information can include, for instance, GPS coordinates gathered by a client computing device that has a GPS receiver.
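One way location-based selection could work is a radius test on the stored coordinates. The sketch below uses the standard haversine great-circle formula; the `gps` metadata key, the radius rule, and the function names are assumptions for illustration, since the disclosure does not say how locations are compared.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(file_meta: Dict, center: Tuple[float, float], radius_km: float) -> bool:
    """True if the file's recorded GPS position lies within radius_km of center."""
    gps = file_meta.get("gps")  # hypothetical metadata key holding (lat, lon)
    if gps is None:
        return False            # files without location data never match
    return haversine_km(gps[0], gps[1], center[0], center[1]) <= radius_km
```

A policy could then select, for example, only files created within 100 km of an office location by calling `within_radius(meta, office_coords, 100)`.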
In one embodiment, the user of a data synchronization system can access synchronized files through a cloud or remote file access interface. After authenticating their identity, the user can access their files via a web browser. In this embodiment, the user may first be presented with a list of files and can then request a specific file. The requested file is delivered by transferring its data from secondary storage (the backup).
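The authenticate, list, then fetch sequence can be sketched as a small class. Everything here is a hypothetical stand-in: the class `RemoteFileAccess`, the in-memory credential table, and the per-user dictionary representing secondary storage are assumptions, and the plain password comparison is for brevity only.

```python
from typing import Dict, List, Set


class RemoteFileAccess:
    """Hypothetical stand-in for a web-based remote file access interface."""

    def __init__(self, secondary_storage: Dict[str, Dict[str, bytes]],
                 credentials: Dict[str, str]) -> None:
        self._storage = secondary_storage   # user -> {path: file data}
        self._credentials = credentials     # user -> password (sketch only)
        self._sessions: Set[str] = set()

    def login(self, user: str, password: str) -> bool:
        # Plain comparison keeps the sketch short; a real system would use
        # a proper authentication mechanism.
        if user in self._credentials and self._credentials[user] == password:
            self._sessions.add(user)
            return True
        return False

    def _require_login(self, user: str) -> None:
        if user not in self._sessions:
            raise PermissionError("user is not authenticated")

    def list_files(self, user: str) -> List[str]:
        """First step after login: show the list of available files."""
        self._require_login(user)
        return sorted(self._storage.get(user, {}))

    def fetch(self, user: str, path: str) -> bytes:
        """Second step: transfer the requested file's data from secondary storage."""
        self._require_login(user)
        return self._storage[user][path]


portal = RemoteFileAccess({"alice": {"/docs/a.txt": b"hello"}}, {"alice": "s3cret"})
```

File data is read only when `fetch` is called, so listing files does not require transferring their contents.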
To summarize the disclosure, certain aspects, advantages, and novel features have been described. Not all of these aspects, advantages, or features will necessarily be included in any one embodiment of the invention.
The synchronization method described in this embodiment involves reviewing metadata for each file of a secondary copy. The secondary copy is created when the files from the primary copy are copied to secondary storage devices.
According to one aspect, the communicating includes communicating a copy of the at least one file that is accessed from the one or more secondary storage devices.
According to one aspect, the identifying operation is performed in response to the secondary copy operation.
According to one aspect, at least some of the metadata is associated with the files in the secondary copy after the secondary copy operation has been initiated.
According to one aspect, the first client computing device creates at least a portion of the metadata for each of the files included in the secondary copy before the secondary copy operation is initiated.
The method also includes: determining characteristics associated with each of the files in the primary storage device after initiating the secondary copy operation; for each of these files, generating with the synchronization module, based on the user-defined criteria and the determined characteristics, an indication as to whether the file should be synchronized with the primary storage device and the secondary storage device; and including the indication along with the metadata associated with the file.
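A minimal sketch of this tagging step follows. It assumes metadata is held as a dictionary per file and that the user-defined criteria are simple key/value equality tests; the function name `tag_for_sync` and the `sync` key are hypothetical.

```python
from typing import Any, Dict, List


def tag_for_sync(files_metadata: List[Dict[str, Any]],
                 criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a copy of each metadata record with a 'sync' indication added."""
    tagged = []
    for meta in files_metadata:
        # The file qualifies only if every criterion equals the determined value.
        flag = all(meta.get(key) == value for key, value in criteria.items())
        tagged.append({**meta, "sync": flag})
    return tagged


records = [
    {"name": "plan.docx", "owner": "alice", "type": "docx"},
    {"name": "photo.jpg", "owner": "alice", "type": "jpg"},
]
tagged = tag_for_sync(records, {"owner": "alice", "type": "docx"})
```

Storing the indication with the metadata means a later step can pick out files to synchronize by reading the flag, without re-evaluating the criteria.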
According to one aspect, the metadata accessed includes at least one of a file name, file owner, file directory, creation date, modification date, file size, file type, or geographical location.
The method includes analyzing the content of the file and storing metadata about the content.
According to one aspect, user-defined synchronization requirements specify files for synchronization that are based in part on content metadata indicating a term or terms present within the file.
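The content analysis described here could, for instance, record which terms of interest occur in a file so that later policy checks read only the stored metadata. This sketch assumes a simple word tokenizer and a `terms` metadata key; both, along with the function names, are illustrative choices.

```python
import re
from typing import Dict, Iterable, List


def content_metadata(text: str, terms_of_interest: Iterable[str]) -> Dict[str, List[str]]:
    """Analyze file content and return metadata listing the terms that are present."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    present = sorted(t.lower() for t in terms_of_interest if t.lower() in words)
    return {"terms": present}


def selected_by_terms(meta: Dict[str, List[str]], policy_terms: Iterable[str]) -> bool:
    """True if any policy term appears in the stored content metadata."""
    stored = set(meta.get("terms", []))
    return any(t.lower() in stored for t in policy_terms)


meta = content_metadata("Quarterly budget review for 2021", ["budget", "Invoice", "review"])
```

Matching on whole words avoids false hits on substrings, at the cost of missing inflected forms such as plurals.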
According to one aspect, at least one of the files communicated to the second client computing device replaces an older version of that file stored on the second client computing device.
The system described in this embodiment synchronizes files between multiple client computers using file metadata. It comprises a data store and a module executing on one or more computer hardware processors. The module can access metadata for each of one or more files in a secondary copy, which was created in a secondary copy operation in which one or more files from a primary storage device were copied to secondary storage devices.