Model-based fraud detection in growing networks
This notebook illustrates the dynamics of a model that captures the average behavior of regular users. Based on the asymptotic behavior of the local clustering, our paper proposes an algorithm that evaluates the degree of membership of each user in well-defined communities, as well as in close-knit groups. The video below shows how communities are formed.
Under the assumption that fraudsters engage in deceptive transactions in a way that resembles random link attacks, the resulting dynamics are shown below.
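Separately from the videos, the intuition behind using local clustering to flag random link attacks can be sketched in a few lines of Python. This is a toy illustration, not the algorithm from the paper: the ring-of-cliques network, the `attacker` node, and all function names are assumptions made for the example. A node that links to users chosen uniformly at random tends to have neighbors that do not know each other, so its local clustering coefficient is low compared with users embedded in close-knit groups.

```python
import random
from itertools import combinations


def add_edge(adj, u, v):
    """Add an undirected link between u and v."""
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * linked / (k * (k - 1))


def ring_of_cliques(num_cliques, clique_size):
    """Toy 'regular user' network: close-knit groups joined in a ring."""
    adj = {}
    for c in range(num_cliques):
        members = [c * clique_size + i for i in range(clique_size)]
        for u, v in combinations(members, 2):
            add_edge(adj, u, v)
        # One bridge from this group to the next one in the ring.
        add_edge(adj, members[-1], ((c + 1) % num_cliques) * clique_size)
    return adj


rng = random.Random(0)
adj = ring_of_cliques(num_cliques=20, clique_size=5)
regular_users = list(adj)

# Hypothetical attacker: links to 10 users chosen uniformly at random.
attacker = "attacker"
for target in rng.sample(regular_users, 10):
    add_edge(adj, attacker, target)

attacker_cc = local_clustering(adj, attacker)
mean_regular_cc = sum(local_clustering(adj, u) for u in regular_users) / len(regular_users)
print(f"attacker clustering: {attacker_cc:.2f}, mean regular clustering: {mean_regular_cc:.2f}")
```

In this toy setting the attacker's clustering stays well below the average of the regular users, which is the kind of gap a clustering-based membership score could exploit.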
Crime hotspots in Chicago
The left plot below shows the total number of crime incidents that occurred in the city of Chicago on January 1st, 2013. The right plot shows the locations of police cameras.
The video below illustrates the areas in Chicago that had a high crime rate during the first half of 2013. Each time step captures the distribution of all types of crimes within the past 7 days. Visualizing the evolution of crime shows that hotspots are confined to specific areas (left plot). The right plot represents the hotspots generated by the locations of the police cameras. Hotspots are created using the optimal bandwidth for standard normal data.
Arrest patterns are shown in the video below. The left plot illustrates the distribution of crimes that led to an arrest; the right plot again shows the locations of police cameras.
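As a rough illustration of how such hotspots can be computed, the sketch below estimates a 2D kernel density over a 7-day window. It assumes that "optimal bandwidth for standard normal data" refers to the normal-reference rule of thumb, h = σ (4 / ((d + 2) n))^(1/(d + 4)), applied per coordinate; the notebook may use a different estimator. The incident coordinates are synthetic, and every function name is hypothetical.

```python
import math
import random
import statistics


def normal_reference_bandwidth(values, dim=2):
    """Rule-of-thumb bandwidth that is optimal when the data are Gaussian."""
    n = len(values)
    sigma = statistics.stdev(values)
    return sigma * (4.0 / ((dim + 2) * n)) ** (1.0 / (dim + 4))


def kde_2d(points, query, hx, hy):
    """Density at `query` using a product of Gaussian kernels."""
    qx, qy = query
    total = 0.0
    for x, y in points:
        zx = (qx - x) / hx
        zy = (qy - y) / hy
        total += math.exp(-0.5 * (zx * zx + zy * zy))
    return total / (len(points) * 2.0 * math.pi * hx * hy)


def window(incidents, day, length=7):
    """Locations of incidents in the `length` days ending on `day`."""
    return [(x, y) for d, x, y in incidents if day - length < d <= day]


# Synthetic incidents (day, longitude, latitude) clustered around one point.
rng = random.Random(1)
incidents = [
    (rng.randint(1, 14), rng.gauss(-87.63, 0.01), rng.gauss(41.88, 0.01))
    for _ in range(200)
]

recent = window(incidents, day=7)
hx = normal_reference_bandwidth([x for x, _ in recent])
hy = normal_reference_bandwidth([y for _, y in recent])

center_density = kde_2d(recent, (-87.63, 41.88), hx, hy)
edge_density = kde_2d(recent, (-87.60, 41.91), hx, hy)
print(f"bandwidths: {hx:.4f}, {hy:.4f}")
print(f"density at cluster center vs. edge: {center_density:.1f} vs. {edge_density:.3f}")
```

Sliding `day` forward one step at a time and re-evaluating the density on a grid would give one frame per time step.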
On the stability of resource undermatching in human group-choice
The notebook simulates the model presented at ACC’14, which captures empirical patterns of the aggregate group-level behavior of individuals competing for resources. When, on a logarithmic plot, there is strict matching between the ratios of the number of individuals choosing any two options (xi/xj) and the number of available resources for these options (wi/wj), the pattern is called an Ideal Free Distribution (IFD) with no undermatching. The video below illustrates how this stationary state is reached based on local decision-making.
The left plot shows the evolution of the opportunity cost as perceived by single individuals; the center plot, the number of resources available per individual; and the right plot, the ratios for various option pairs. The diagonal dotted line represents the strict match between choices and resources.
However, empirical patterns show that the IFD state is reached with some degree of undermatching. Undermatching can be described as a globally balanced state in which the perceived cost of the best forgone alternatives is approximately the same for all individuals. The video below illustrates the dynamics for a case with undermatching in which there are no further dynamics once the IFD is reached.
The right plot shows how the ratio of individuals choosing two different options relates to the ratio of locally available resources on the logarithmic plot. The dots indicate deviations in group-choice behavior from the diagonal line, which represents strict matching. The plot illustrates low discriminability of resources, represented by a = 0.57. When the state reaches the IFD (at around k = 115), variations in the number of resources per individual are small for any option, but the ratio of resources between two options does not match the ratio of individuals choosing these options. Individuals choosing the option with the most resources enjoy higher resource rates.
Depending on the distribution of resources, individuals may keep changing options even after reaching the IFD with undermatching. The video below illustrates this case.
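One way to read the value of a is as the slope of the log-log relation between choice ratios and resource ratios, as in the generalized matching law log(xi/xj) = a log(wi/wj) + log b. That interpretation is an assumption here, not something the notebook states explicitly. Under it, a = 1 corresponds to strict matching and a < 1 to undermatching. The sketch below recovers a by least squares from synthetic ratios; the function name and the data are illustrative only.

```python
import math


def fit_matching_exponent(choice_ratios, resource_ratios):
    """Least-squares fit of log10(x_i/x_j) = a * log10(w_i/w_j) + log10(b)."""
    xs = [math.log10(r) for r in resource_ratios]
    ys = [math.log10(c) for c in choice_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic option pairs generated with an exponent of 0.57 and no bias.
resource_ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
choice_ratios = [r ** 0.57 for r in resource_ratios]

a, log_b = fit_matching_exponent(choice_ratios, resource_ratios)
print(f"estimated a = {a:.2f}, log10(b) = {log_b:.2f}")
```

With real simulation output, the dots in the right plot would play the role of `choice_ratios` and `resource_ratios`, and the fitted slope would quantify how far the group is from the diagonal.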
To download this notebook click here.
A network of the types of crimes in San Francisco, CA
Relationships between different types of crimes can be visualized as a network, with a node representing each type of crime (e.g., bribery, stolen property, or drug-dealing) and a link quantifying the co-occurrence of two types of criminal activities in the same neighborhood (within a time window). This notebook illustrates the relationships between various types of crimes in San Francisco, CA (data available at https://data.sfgov.org). Two types of crimes are related if most instances associated with both types occurred in the same neighborhood within the same day.
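The construction above can be sketched as follows. This is a simplified reading of the rule, not the notebook's implementation. Each crime type is mapped to the set of (neighborhood, day) cells in which it occurs, and two types are linked when more than half of the cells containing either type contain both. The 0.5 threshold, the overlap measure, the neighborhood labels, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(records, threshold=0.5):
    """Link crime types whose (neighborhood, day) cells mostly coincide.

    `records` is an iterable of (crime_type, neighborhood, day) tuples.
    Returns a dict mapping (type_a, type_b) to the overlap in (0, 1].
    """
    cells = defaultdict(set)  # crime type -> set of (neighborhood, day)
    for crime_type, neighborhood, day in records:
        cells[crime_type].add((neighborhood, day))

    edges = {}
    for a, b in combinations(sorted(cells), 2):
        shared = len(cells[a] & cells[b])
        either = len(cells[a] | cells[b])
        overlap = shared / either
        if overlap > threshold:
            edges[(a, b)] = overlap
    return edges


# Made-up records, for illustration only.
records = [
    ("DRUG/NARCOTIC", "Tenderloin", 1),
    ("STOLEN PROPERTY", "Tenderloin", 1),
    ("DRUG/NARCOTIC", "Mission", 2),
    ("STOLEN PROPERTY", "Mission", 2),
    ("DRUG/NARCOTIC", "Mission", 3),
    ("BRIBERY", "Marina", 3),
]

edges = cooccurrence_network(records)
for (a, b), weight in edges.items():
    print(f"{a} -- {b}: {weight:.2f}")
```

The resulting edge weights can serve as link strengths when drawing the network.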
The video below highlights the types of crime with the two highest eigenvector centralities, which correspond to nodes that are connected to many other well-connected nodes.
The video below highlights the types of crime with the two highest eccentricity centralities, which correspond to the nodes that are at short maximum distances to every other node.
The video below highlights the types of crime with the two highest degree centralities, which correspond to the nodes that have high vertex degrees.
The video below highlights the types of crime with the two highest betweenness centralities, which correspond to the nodes that lie on many shortest paths between other node pairs.
Next, the plots below show the degree distribution (for different days) during a 90-day period. Simulations are based on uniform neighborhoods with radii of 0.004 degrees (centered around each instance).
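The four centrality measures above can be compared on a small example. The sketch below is plain Python on a made-up six-node graph, not the notebook's code. It assumes an undirected, unweighted, connected network, takes eccentricity centrality to be the reciprocal of the maximum distance to any other node (so short maximum distances score high), and uses Brandes' algorithm for betweenness. All function names are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances (in hops) from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(adj):
    return {u: len(adj[u]) for u in adj}


def eccentricity_centrality(adj):
    # Reciprocal of the largest distance to any other node (connected graph).
    return {u: 1.0 / max(bfs_distances(adj, u).values()) for u in adj}


def betweenness_centrality(adj):
    # Brandes' algorithm, unnormalized, for undirected unweighted graphs.
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {u: b / 2 for u, b in score.items()}  # each pair was counted twice


def eigenvector_centrality(adj, iterations=200):
    # Power iteration on A + I; the shift avoids oscillation on bipartite graphs.
    x = dict.fromkeys(adj, 1.0)
    for _ in range(iterations):
        nxt = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {u: val / norm for u, val in nxt.items()}
    return x


def top_two(scores):
    """Two highest-scoring nodes; ties are broken alphabetically."""
    return sorted(scores, key=lambda u: (-scores[u], u))[:2]


# Toy graph: a triangle A-B-C with a tail A-D-E-F.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E"), ("E", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

deg = degree_centrality(adj)
ecc = eccentricity_centrality(adj)
btw = betweenness_centrality(adj)
eig = eigenvector_centrality(adj)
for name, scores in [("degree", deg), ("eccentricity", ecc),
                     ("betweenness", btw), ("eigenvector", eig)]:
    print(name, top_two(scores))
```

Even on this small graph the measures disagree about which nodes matter most, which is why the videos highlight a different pair of crime types for each centrality.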
To download this notebook click here. Similar plots of the degree distribution for the City of Chicago (for a 10-day period) are shown below. The notebook can be downloaded here.
Structure of growing networks with no preferential attachment
This notebook illustrates the asymptotic behavior of the degree distribution for highly clustered networks. In particular, it tries to explain the formation of structural properties under the two following conditions:
- The formation of links cannot be described according to the principle of preferential attachment; and
- The in-degree distribution fits a power law for nodes with a high degree and an exponential form otherwise (i.e., an extended power law).
Extended power laws are – in some contexts – a better fit than single or double power laws, e.g., to describe the degree distribution of online social networks and patent citation networks. Our paper characterizes the relation between the scaling exponent and the probability of forming triads. The transition from exponential to power law distributions depends on both the scaling exponent and the number of links that newly added nodes establish. Note that the underlying mechanism accounts for strong neighborhood clustering based on a random triad formation process. Average clustering properties remain constant as the size of the network grows. To download the notebook click here.
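To make the ingredients concrete, the sketch below grows a network in which each new node picks its first target uniformly at random (no preferential attachment) and then, with probability `p_triad`, links to a neighbor of that target, closing a triangle. This is a generic random triad formation process written for illustration. The parameters `m` and `p_triad`, the seed graph, and the function names are assumptions, and the code should not be read as the exact model analyzed in the paper.

```python
import random
from collections import Counter


def grow_network(n, m=2, p_triad=0.8, seed=0):
    """Grow a directed network of n nodes; each new node creates m links."""
    rng = random.Random(seed)
    out_links = {i: set() for i in range(m + 1)}
    neighbors = {i: set() for i in range(m + 1)}
    # Seed graph: m + 1 fully connected nodes, links pointing to older nodes.
    for i in range(m + 1):
        for j in range(i):
            out_links[i].add(j)
            neighbors[i].add(j)
            neighbors[j].add(i)

    for new in range(m + 1, n):
        anchor = rng.randrange(new)  # uniform choice among existing nodes
        targets = {anchor}
        while len(targets) < m:
            candidates = sorted(neighbors[anchor] - targets)
            if candidates and rng.random() < p_triad:
                targets.add(rng.choice(candidates))  # triad formation step
            else:
                targets.add(rng.randrange(new))  # uniform step; duplicates are retried
        out_links[new] = targets
        neighbors[new] = set(targets)
        for t in targets:
            neighbors[t].add(new)
    return out_links


def in_degree_distribution(out_links):
    """Map each in-degree k to the number of nodes with that in-degree."""
    in_degree = Counter({u: 0 for u in out_links})
    for targets in out_links.values():
        for t in targets:
            in_degree[t] += 1
    return Counter(in_degree.values())


network = grow_network(n=2000, m=2, p_triad=0.8, seed=0)
distribution = in_degree_distribution(network)
for k in sorted(distribution)[:10]:
    print(k, distribution[k])
```

Plotting `distribution` on logarithmic axes for different values of `m` and `p_triad` is one way to explore how the shape of the in-degree distribution depends on these two parameters.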