Invented by Justin D. Call, Subramanian Varadarajan, Xiaochan Huang, Xiaoming Zhou, Marc R. Hansen, Shape Security Inc

The market for protecting server computers by detecting the browser identity on client computers has seen significant growth in recent years. With the increasing reliance on the internet for various business operations, server security has become a top priority for organizations worldwide. Detecting the browser identity on client computers has emerged as an effective method to enhance server security and prevent unauthorized access.

In today’s digital landscape, server computers store and process vast amounts of sensitive data, making them attractive targets for cybercriminals. Hackers constantly evolve their techniques to exploit vulnerabilities and gain unauthorized access to servers. Traditional security measures, such as firewalls and antivirus software, are no longer sufficient to protect against sophisticated attacks.

Detecting the browser identity on client computers offers an additional layer of security by verifying the authenticity of the user accessing the server. This method involves analyzing the browser’s user agent string, which contains information about the browser type, version, and operating system. By comparing this information with known patterns of legitimate user agents, suspicious or malicious activities can be identified and blocked.

One claimed advantage of detecting the browser identity is that it can help detect and limit browser-based attacks, such as cross-site scripting (XSS) and cross-site request forgery (CSRF). These attacks exploit vulnerabilities in web applications and can lead to unauthorized access or data breaches. Knowing the browser identity lets server administrators apply security measures targeted at those risks.

Furthermore, detecting the browser identity can help prevent session hijacking and identity theft. By monitoring the user agent string, server administrators can detect anomalies or inconsistencies that may indicate a compromised session or an impersonation attempt. This allows for immediate action to be taken, such as terminating the session or requesting additional authentication.

The market for protecting server computers by detecting the browser identity on client computers has seen a surge in demand due to the increasing number of cyber threats and the growing awareness of the importance of server security. Organizations across various industries, including finance, healthcare, and e-commerce, are investing in robust security solutions to safeguard their servers and protect sensitive data.

Several companies have emerged as key players in this market, offering innovative solutions that utilize advanced algorithms and machine learning techniques to detect and analyze browser identities. These solutions provide real-time monitoring and alerting capabilities, allowing administrators to respond swiftly to potential threats.

As the market continues to grow, it is expected that the demand for solutions that detect browser identities will increase. Server administrators are becoming more proactive in their approach to security, recognizing the need for comprehensive measures to protect against evolving cyber threats. Additionally, regulatory requirements, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), are driving organizations to implement stronger security measures to ensure compliance.

In conclusion, the market for protecting server computers by detecting the browser identity on client computers is witnessing significant growth. This method offers an effective way to enhance server security, prevent unauthorized access, and mitigate various browser-based attacks. As organizations prioritize server security and regulatory compliance, the demand for solutions that detect browser identities is expected to continue rising.

The Shape Security Inc invention works as follows

A computer-implemented method for identifying abnormal behavior includes receiving, at a server subsystem, data that characterizes subsets of particular document object models (DOMs) rendered by particular client computers; identifying clusters in that data; and using the clusters to identify alien content on particular client computers. Alien content is content in a document object model that does not derive from the content that was served as the basis for that document object model.

Background for Protecting server computers by detecting the browser identity on client computers

Computer fraud is big business, both for the fraudsters and for those who try to stop them. In a common form of such fraud, organizations break into people’s computers to obtain sensitive information or access codes. Fraudulent software can also make computers act in an automated, malicious way. Such computers are often called “bots,” and a group of bots acting in concert under the control of a single entity is known as a “botnet.” Bots can be programmed to perform illegitimate actions, such as contacting banks or retailers and using people’s credit card or other financial details. Malicious code may also perform a “Man in the Browser” attack, in which code placed on a user’s computer intercepts communications between the user and, for example, the user’s bank after the computer’s web browser has decrypted them. The malicious code can alter the interface the user sees. For example, it may generate an interface that makes it look as though the user’s bank is asking for certain information (e.g., a PIN) when the bank never requests such information through a web site. It can also generate an interface indicating that a banking or shopping transaction was executed as the user requested when, in reality, the illegal organization altered the transaction to send money to an entity affiliated with it.

Several approaches have been used to identify and stop such malicious activity. For example, software has been developed that runs on client computers, on behalf of the organizations that own and operate them, to detect inappropriate activity.

This document describes systems and techniques for reporting anomalous behavior by “alien” code on client computers. Such behavior may be reported to a central service at a central server system. The central service can then identify clusters of anomalous actions (i.e., anomalous actions that are sufficiently similar to one another, and sufficiently different from other actions, to be identified as related and likely generated by the same source of code). Clustering lets the service identify code that has been distributed widely to many machines and performs similar actions, and determine whether those anomalous actions are benign or malicious. Benign actions can be caused, for example, by a web browser plug-in that users installed for their own purposes. Malicious actions can be caused by hidden code that a criminal organization secretly installed on users’ computers to collect private information, set up botnets, or pursue other malicious or illegal ends.

Anomalous actions may be identified when alien code takes actions that are not appropriate for the web page that the content provider served to the computer. To make such actions identifiable, the code for a web page (e.g., HTML, CSS, or JavaScript) may be rewritten by a system that is an adjunct to the web server, so that the page is served differently each time someone accesses it. The modifications may differ from one serving to the next, whether to the same computer or to different computers. Two different users, or a single user in two separate web browsing sessions, may receive slightly different code in response to the same request. The differences can be confined to implicit parts of the code that are not displayed, so the user will not notice them. For example, the names of software objects sent to the client device may change in an essentially random way each time a web page is served.
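The per-serve renaming described above can be illustrated with a minimal Python sketch. It is not the patented rewriting system: the function name `rewrite_page`, the alias format ("n_" plus random hex), and the regex-based substitution are all assumptions made for illustration, and a real rewriter would need to parse HTML, CSS, and JavaScript rather than substitute text.

```python
import re
import secrets


def rewrite_page(source: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Return the page with each listed name swapped for a fresh random alias.

    The alias scheme ("n_" plus random hex) is an assumption for this sketch.
    """
    if not names:
        return source, {}
    # Draw a new alias for every name on every call, i.e. on every serving.
    mapping = {name: "n_" + secrets.token_hex(8) for name in names}
    # Match whole identifiers only, so "login" does not rewrite "loginCount".
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    rewritten = pattern.sub(lambda m: mapping[m.group(1)], source)
    return rewritten, mapping


page = '<form id="login"><input name="password"></form>'
first, first_map = rewrite_page(page, ["login", "password"])
second, second_map = rewrite_page(page, ["login", "password"])
# The two servings carry different names for the same objects; the server
# keeps each mapping so it can translate incoming requests back.
```

The server would need to retain each mapping for the lifetime of the served page so that names in incoming requests can be translated back to the originals.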

This randomness is meant to prevent malicious parties from reverse-engineering the code or writing bots that exploit the page. It also helps expose alien code: code that is unaware that the server system is revising the page may try to interact with the original, unrevised code. The rewritten web page code may be sent to clients together with instrumentation code that identifies such actions and reports them to the server system.

Alternatively or additionally, code can be provided that periodically analyzes the document object model (DOM) on the various machines that have loaded code for a specific web site or page. That code may walk the DOM to produce a compact representation of it. For example, the added code may compute one or more hashes, at different times, of the DOM or of a part of it, or produce another compact representation. The added code (e.g., added JavaScript) can also compute changes in the DOM over time on a given client device (e.g., at defined states of a page, such as before and after a user’s action to order an item online or to change a bank account). The added code may then send the information it has gathered about the DOM to the server.

The code can also collect and report metadata about the device’s state, including its IP address (which can be used to determine geographic location), the brand and version of the web browser that rendered the page, plug-in information, GPS location information, and so on. In one example, the code computes a hash of the difference between a DOM, or a part of a DOM, before and after an operation such as the user clicking a button in a purchase process. It then reports this hash to the central system, because hashes allow for easy comparison and analysis.
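As a rough illustration of hashing the difference between two DOM snapshots, the Python sketch below models a DOM as a small tree and hashes only the elements that changed. The `Node` class, the `flatten` and `change_digest` helpers, the position-keyed string format, and the choice of SHA-256 are all assumptions for illustration; the source does not specify a hash function or a diff format, and real instrumentation would run as JavaScript in the browser.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Minimal stand-in for a DOM element (hypothetical, for illustration)."""
    tag: str
    attrs: dict[str, str] = field(default_factory=dict)
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def flatten(node: Node, path: str = "") -> set[str]:
    """Walk the tree and describe every element as one string keyed by its position."""
    here = f"{path}/{node.tag}"
    attrs = ",".join(f"{k}={v}" for k, v in sorted(node.attrs.items()))
    lines = {f"{here}[{attrs}]:{node.text}"}
    for index, child in enumerate(node.children):
        lines |= flatten(child, f"{here}.{index}")
    return lines


def change_digest(before: Node, after: Node) -> str:
    """Hash only the elements that differ between the two snapshots."""
    changed = sorted(flatten(before) ^ flatten(after))
    return hashlib.sha256("\n".join(changed).encode("utf-8")).hexdigest()
```

Because only the changed elements are hashed, two clients that see the same legitimate change report the same digest, while a client on which extra content (say, an injected input field) appears during the same action reports a different one.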

In certain implementations, malicious activity can be both deflected and detected in a relatively sophisticated way by changing the environment in which executable code, such as JavaScript, runs on the client device (in addition to changing the corresponding references in the HTML code). For example, document.write is a method that malicious code can use to change the document object model (DOM) of a web page, and thereby to alter what a user sees when viewing the page. A security system can (1) instrument the served code that corresponds to such a method, so that the instrumentation code reports calls to the method and additionally includes data that characterizes those calls, allowing the system to detect abnormal activity and perhaps to use the additional data to determine whether that activity is malicious or benign; and (2) change the function name to “document.#3@1*87%5.write,” “1@2234$56%.4$4$345%4. @12111@,” or any other legal name containing random text, which can be changed automatically each time the code is served. This constant change makes it hard for malicious parties to keep up. It also exposes malicious code, which is reported when it tries to interact with an outdated method name. Other JavaScript actions that can be instrumented and continually changed include “getElementById,” “getElementByName,” XPath commands, and the setting of HTML elements in the DOM to particular values.
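The idea of reporting calls that use an outdated method name can be sketched in Python as a toy model. Everything here is hypothetical: the `InstrumentedPage` class, its `call` dispatcher, the alias format, and the shape of the report are assumptions for illustration, and actual instrumentation would wrap browser methods in JavaScript rather than dispatch by name.

```python
import secrets


class InstrumentedPage:
    """Toy model of a served page whose write method has a per-serve alias."""

    def __init__(self) -> None:
        # Hypothetical alias format; a fresh one is drawn for every serving.
        self.write_alias = "write_" + secrets.token_hex(6)
        self.written: list[str] = []
        self.reports: list[dict[str, object]] = []

    def call(self, method_name: str, content: str) -> bool:
        """Dispatch a call by name, reporting any name that is not the current alias."""
        if method_name == self.write_alias:
            self.written.append(content)
            return True
        # A stale or unknown name suggests code that did not come from this serving.
        self.reports.append({"method": method_name, "content_length": len(content)})
        return False
```

Code served with the page knows the current alias and succeeds; code that still calls the original name is refused and leaves a report that could be forwarded to the server system.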

The server system can then store the information reported by the extra code running on many different client devices and analyze an aggregated representation of it to identify anomalous behavior across devices. An operator of the server system may designate certain features as relevant to identifying malicious activity, for example geographic location, browser type, or specific elements of the DOM.

To perform this analysis, each such feature can be assigned a dimension in a hyperplane. A learning engine can then analyze the hyperplane to determine whether clusters or other patterns emerge. Clustering in a hyperplane is computationally simpler than other techniques and can indicate alien code in some circumstances. For example, focusing only on data from a certain class of browsers can raise the rate of DOM anomalies above a recognizable threshold. Simultaneous clustering in multiple planes can indicate that an exploit (e.g., a zero-day exploit) is being attempted against a specific browser brand or version. Other clusters of a similar nature can also be identified to determine the presence of alien content on specific client computers.
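The point about narrowing to one class of browsers can be made concrete with a small Python sketch. It is a deliberately simple per-dimension grouping, not the learning engine the source describes: the report fields (`browser`, `site`, `dom_digest`), the helper names, and the threshold value are assumptions for illustration.

```python
from collections import defaultdict


def anomaly_rates(reports: list[dict[str, str]], dimension: str,
                  expected_digest: str) -> dict[str, float]:
    """Fraction of reports, per value of one dimension, whose digest is unexpected."""
    totals: dict[str, int] = defaultdict(int)
    anomalies: dict[str, int] = defaultdict(int)
    for report in reports:
        key = report[dimension]
        totals[key] += 1
        if report["dom_digest"] != expected_digest:
            anomalies[key] += 1
    return {key: anomalies[key] / totals[key] for key in totals}


def flagged(rates: dict[str, float], threshold: float) -> list[str]:
    """Values of the dimension whose anomaly rate exceeds the threshold."""
    return sorted(key for key, rate in rates.items() if rate > threshold)
```

With 18 clean reports from one browser and one clean plus one anomalous report from another, the site-wide anomaly rate is 1 in 20 and stays under a threshold of 0.2, while grouping by browser puts the second browser at 1 in 2 and flags it.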

The clusters can be analyzed further to determine whether the alien content is benign or malicious and, if it is malicious, what action to take. For example, when multiple clusters are present, they may be checked for previously identified alien content. The content may turn out to come from plug-ins that have been identified before, in which case the results can safely be ignored. The presence of such benign content can be used to “excuse” an event that is occurring across a large population of computers. Conversely, the content may be identified as having been inserted by previously identified malware, crimeware, vandalware, or the like. Information about the infected system can then be exported to another system that can block any further transactions.
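Checking clusters against previously identified content might look like the following Python sketch. The function names, the use of digests as cluster identifiers, and the rule that a digest on both lists is treated as malicious are assumptions for illustration, not details from the source.

```python
def classify(digest: str, benign: set[str], malicious: set[str]) -> str:
    """Label one cluster by looking its digest up in previously identified sets."""
    if digest in malicious:
        # Assumption: when a digest is on both lists, err on the side of caution.
        return "malicious"
    if digest in benign:
        return "benign"
    return "unknown"


def triage(cluster_sizes: dict[str, int], benign: set[str],
           malicious: set[str]) -> dict[str, list[str]]:
    """Group cluster digests by label, largest clusters first within each label."""
    result: dict[str, list[str]] = {"benign": [], "malicious": [], "unknown": []}
    for digest in sorted(cluster_sizes, key=cluster_sizes.get, reverse=True):
        result[classify(digest, benign, malicious)].append(digest)
    return result
```

Clusters labeled benign can be ignored, clusters labeled malicious can be passed to whatever system blocks transactions, and unknown clusters are the ones left for further analysis.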

The analysis described here can be performed for a single web site or domain or across several different sites, e.g., by a third party that offers web security products or services to domain operators. Such a cross-domain operator could, for example, identify interactions that are common across domains. For instance, if several domains share the same underlying software, the operator might compare the DOMs for those sites before and after a common action. A malicious party may be assumed to focus on such an event to intercept information, so the change in the DOM across the action will differ between a “clean” computer and an infected one.

Such detection activities, carried out across multiple web transactions or sessions, may also be coordinated with other activities that deflect malicious actions in specific transactions or sessions (i.e., deflection can occur at individual clients as well as at a central system, and can happen with or without an initial detection). For example, instrumentation code added to the code served by a server system can identify anomalous activity and alert the server system. The system could then appear to carry out the transaction, so that the malicious party thinks nothing has happened, while in fact avoiding carrying out a financial transaction.

The code running on the client may, for example, be programmed to identify a function call that does not match the function calls permitted for a particular served web page (e.g., where the alien call matches a name in the original page provided by the web server but not the revised name generated by the renaming techniques discussed above). Such alien content could be a benign or legitimate piece of code installed on the client’s machine, like a browser plug-in, or an indication that the computer is infected by malicious code.

In one implementation, a computer-implemented method for identifying abnormal behavior can include: receiving, at a server subsystem, data that characterizes subsets of particular document object models rendered by particular client computers; identifying clusters in the data that characterizes the subsets of the particular document object models; and using the clusters to identify alien content on particular client computers. Alien content is content in a document object model that does not result from the content that served as the basis for that document object model.

This and other implementations described here can include the following features. Code provided by the server subsystem to the client computers along with the web pages can generate the data that characterizes the subsets of the document object models. That code can identify changes in the document object models when events relating to the web pages occur. The events can include user selections of objects on the web pages. The code can perform reduction processes to generate strings that are substantially smaller than the document object models and that characterize their content. The reduction processes may include a hash function applied to portions of the document object models. Clusters can be identified by plotting data from particular client computers on a hyperplane.

These and other implementations described here may optionally include some or all of the following features. The hyperplane can be defined by dimensions that correspond to web page features identified as relevant to determining whether actions are benign or malicious. The clusters can be used to identify alien content by identifying features previously associated with benign variations on client computers.

The method can further include sending, by the computer subsystem, at least part of the data that characterizes the subsets of particular document object models to a central security server, where the central security server is configured to receive such data from a plurality of computer subsystems. The method may also include receiving, at the computer subsystem, data that indicates context for at least one of (i) the web pages or (ii) the particular client computers. The context data can include information that identifies the particular client computers or an application that rendered a particular web page on those computers. The context data can be used in identifying clusters in the data that characterizes the subsets of particular document object models.

The hyperplane can be defined by dimensions that include an identifier for the particular client computers, an identifier for an application that rendered the web page at the particular client computers, and an identifier for a web site or web domain that served particular web pages.

The clusters can be identified at least partly based on the identities of particular web pages or the identities of the domains that served them. The method may also include determining whether the identified alien content is benign or malicious. This determination can be made by comparing the identified alien content to other alien content previously identified as benign or malicious.

In one implementation, a computer system can include: one or more computing devices; an instrumentation module that is installed on the one or more computing devices and is configured to supplement web code with instrumentation code that is executable on client devices and collects information about execution of the web code at those devices, the information including representations of document object models of the web code; and a security monitoring module that is installed on the one or more computing devices and is configured to analyze the information collected from a number of client devices by instrumentation codes on
