filters module

Filter typosquatting-related lists.

A module containing the functions that filter typosquatting-related data.

filters.distance_calculations(package_of_interest, all_packages, max_distance=1)

Find packages within a defined edit distance of the package of interest and return them as a sorted list.

Parameters:
  • package_of_interest (str) – package name on which to perform comparison
  • all_packages (list) – list of all package names
  • max_distance (int) – the maximum edit distance at which a package is still reported
Returns:

potential typosquatters

Return type:

list
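
The edit-distance screen described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the helper names edit_distance and close_names are hypothetical, plain Levenshtein distance is assumed as the metric, and excluding the package of interest itself from the result is also an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def close_names(package_of_interest, all_packages, max_distance=1):
    """Hypothetical stand-in for the screen: names within max_distance, sorted."""
    # Assumption: the package of interest itself is not reported.
    return sorted(
        name for name in all_packages
        if name != package_of_interest
        and edit_distance(package_of_interest, name) <= max_distance
    )
```

Under plain Levenshtein distance a transposition such as 'reqeusts' counts as two edits, so it would only be reported with max_distance=2 in this sketch.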

filters.filter_by_package_name_len(package_list, min_len=5)

Keep packages whose name is >= a minimum length.

Parameters:
  • package_list (list) – a list of package names
  • min_len (int) – the minimum name length, in characters
Returns:

filtered package names

Return type:

list
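
A length filter of this kind can be sketched in a single comprehension. The function name keep_long_names is hypothetical; only the parameters and the >= comparison come from the description above.

```python
def keep_long_names(package_list, min_len=5):
    """Keep names whose length is at least min_len, preserving input order."""
    return [name for name in package_list if len(name) >= min_len]
```

With the default min_len=5, a five-character name such as 'numpy' is kept, while 'six' and 'toml' are dropped.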

filters.homophone_attack_screen(package_of_interest, all_packages)

Find packages that prey on homophone confusion.

This screen checks for attacks that prey on user confusion related to homophones. For instance, ‘klumpz’ vs. ‘clumps’. This function helps find confusion attacks, rather than misspelling attacks.

Parameters:
  • package_of_interest (str) – package name on which to perform comparison
  • all_packages (list) – list of all package names
Returns:

potential typosquatting packages

Return type:

list
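
The idea behind the homophone screen can be illustrated by comparing names through a phonetic key. The sketch below is only an approximation under stated assumptions: the substitution table SOUND_ALIKE and the helpers phonetic_key and sound_alikes are hypothetical, and the table is a toy, not a real phonetic algorithm. The module may well use a proper phonetic encoding instead.

```python
# Toy spelling-to-sound substitutions (an assumption for illustration only).
SOUND_ALIKE = (("ph", "f"), ("ck", "k"), ("c", "k"), ("z", "s"), ("y", "i"))


def phonetic_key(name):
    """Collapse a name to a rough sound-alike key using the toy table."""
    key = name.lower()
    for spelling, sound in SOUND_ALIKE:
        key = key.replace(spelling, sound)
    return key


def sound_alikes(package_of_interest, all_packages):
    """Hypothetical stand-in: other names that share the same phonetic key."""
    target = phonetic_key(package_of_interest)
    return [
        name for name in all_packages
        if name != package_of_interest and phonetic_key(name) == target
    ]
```

In this sketch 'klumpz' and 'clumps' both reduce to the key 'klumps', so each is reported as a sound-alike of the other.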

filters.order_attack_screen(package, all_packages)

Find packages that prey on user confusion about order.

This screen checks for attacks that prey on user confusion about word order. For instance, python-nmap vs nmap-python. The edit distance is very high, but the conceptual distance is close. This function currently identifies only packages that capitalize on user confusion about word order when words are separated by dashes or underscores.

Parameters:
  • package (str) – package name on which to perform comparison
  • all_packages (list) – list of all package names
Returns:

potential typosquatting packages

Return type:

list
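
One way to express the word-order check is to split each name on dashes and underscores and compare the sorted words. This is a sketch under assumptions, not the module's code: the helper names word_key and reordered_names are hypothetical, and skipping single-word names is a choice made here for illustration.

```python
import re


def word_key(name):
    """Split on dashes or underscores and return the words in sorted order."""
    return tuple(sorted(re.split(r"[-_]", name)))


def reordered_names(package, all_packages):
    """Hypothetical stand-in: other names built from the same words."""
    target = word_key(package)
    if len(target) < 2:
        # Assumption: a single-word name has no word order to confuse.
        return []
    return [
        name for name in all_packages
        if name != package and word_key(name) == target
    ]
```

Because only the set of words is compared, this sketch also matches names that differ solely in separator, such as nmap_python for python-nmap.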

filters.whitelist(squat_candidates, whitelist_filename='whitelist.txt')

Remove whitelisted packages from typosquat candidate list.

Parameters:
  • squat_candidates (dict) – dict of packages and potential typosquatters
  • whitelist_filename (str) – file location for whitelist
Returns:

packages and post-whitelist potential typosquatters

Return type:

dict
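
The whitelist step can be sketched as below. Several details are assumptions, since the file format is not described here: the whitelist file is taken to hold one package name per line, a whitelisted name is removed from every candidate list, and squat_candidates is taken to map each package to a list of candidate names. The function name apply_whitelist is hypothetical.

```python
def apply_whitelist(squat_candidates, whitelist_filename="whitelist.txt"):
    """Return a new dict with whitelisted names removed from each candidate list."""
    # Assumption: one whitelisted package name per line; blank lines are ignored.
    with open(whitelist_filename, encoding="utf-8") as handle:
        allowed = {line.strip() for line in handle if line.strip()}
    return {
        package: [name for name in candidates if name not in allowed]
        for package, candidates in squat_candidates.items()
    }
```

The input dict is left unmodified, so the pre-whitelist candidates stay available for comparison.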