class Linguist::Heuristics

A collection of simple heuristics used to tell apart languages that share a file extension.

Constants

CPlusPlusRegex
HEURISTICS_CONSIDER_BYTES
ObjectiveCRegex

Common heuristics

Public Class Methods

call(blob, candidates)

Public: Use heuristics to detect language of the blob.

blob       - An object that quacks like a blob.
candidates - Array of Language objects.

Examples

Heuristics.call(FileBlob.new("path/to/file"), [
  Language["Ruby"], Language["Python"]
])

Returns an Array of Languages, or an empty Array if no heuristic matched or the matching heuristic was inconclusive.

# File lib/linguist/heuristics.rb, line 18
def self.call(blob, candidates)
  data = blob.data[0...HEURISTICS_CONSIDER_BYTES]

  @heuristics.each do |heuristic|
    if heuristic.matches?(blob.name, candidates)
      return Array(heuristic.call(data))
    end
  end

  [] # No heuristics matched
end
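
The dispatch in the listing above can be sketched in isolation. This is a minimal sketch, not Linguist's code: `SketchLanguage`, `SketchBlob`, `SketchHeuristic`, `sketch_call` and the 16-byte `CONSIDER_BYTES` limit are hypothetical stand-ins chosen for illustration. It shows three things visible in the listing: only a leading slice of the data is inspected, the first heuristic that applies decides the outcome, and `Array(nil)` turns an inconclusive result into an empty Array.

```ruby
# Hypothetical stand-ins; none of these names come from Linguist itself.
class SketchLanguage
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

SketchBlob = Struct.new(:name, :data)

CONSIDER_BYTES = 16 # assumed limit, for this sketch only

class SketchHeuristic
  def initialize(ext, &block)
    @ext = ext
    @block = block
  end

  def matches?(filename)
    filename.downcase.end_with?(@ext)
  end

  def call(data)
    @block.call(data)
  end
end

PERL = SketchLanguage.new("Perl")

SKETCH_HEURISTICS = [
  SketchHeuristic.new(".pm") { |data| PERL if data.include?("use strict") }
].freeze

def sketch_call(blob)
  # Only the leading slice of the data is handed to the heuristic.
  data = blob.data[0...CONSIDER_BYTES]

  SKETCH_HEURISTICS.each do |heuristic|
    # The first heuristic that applies decides the result, even when it returns nil.
    return Array(heuristic.call(data)) if heuristic.matches?(blob.name)
  end

  [] # no heuristic applied
end

matched      = sketch_call(SketchBlob.new("Foo.pm", "use strict;\n"))
inconclusive = sketch_call(SketchBlob.new("Foo.pm", "package Foo;\n"))
unmatched    = sketch_call(SketchBlob.new("foo.rb", "use strict;\n"))
# The marker sits beyond the first 16 bytes, so the heuristic never sees it.
late_marker  = sketch_call(SketchBlob.new("Foo.pm", ("#" * 20) + "use strict"))
```

A plain class is used for the language stand-in on purpose: `Array()` would unpack a Struct into its fields, because Struct responds to `to_a`.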

disambiguate(*exts_and_langs, &heuristic)

Internal: Define a new heuristic.

exts_and_langs - String names of file extensions and languages to disambiguate.

heuristic - Block which takes data as an argument and returns a Language or nil.

Examples

disambiguate ".pm" do |data|
  if data.include?("use strict")
    Language["Perl"]
  elsif /^[^#]+:-/.match(data)
    Language["Prolog"]
  end
end
# File lib/linguist/heuristics.rb, line 46
def self.disambiguate(*exts_and_langs, &heuristic)
  @heuristics << new(exts_and_langs, &heuristic)
end
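
The registration pattern above, a class-level method that appends a new instance to a class-level list, can be tried without Linguist. The sketch below makes several assumptions: `SketchHeuristics` is a hypothetical class, `@heuristics` is initialised to an empty Array (the listing does not show where the real one is set up), the blocks return plain Strings instead of Language objects, and the `.example` rule is invented purely to show ordering.

```ruby
class SketchHeuristics
  @heuristics = [] # assumed initialisation; not shown in the listing above

  class << self
    # Reader for the class-level rule list, added for inspection in this sketch.
    attr_reader :heuristics
  end

  def self.disambiguate(*exts_and_langs, &heuristic)
    @heuristics << new(exts_and_langs, &heuristic)
  end

  attr_reader :exts_and_langs

  def initialize(exts_and_langs, &heuristic)
    @exts_and_langs = exts_and_langs
    @heuristic = heuristic
  end

  def call(data)
    @heuristic.call(data)
  end

  # Each call in the class body registers one rule, in declaration order.
  disambiguate ".pm" do |data|
    if data.include?("use strict")
      "Perl"
    elsif /^[^#]+:-/.match(data)
      "Prolog"
    end
  end

  # Invented rule, present only to show that order is preserved.
  disambiguate ".example" do |data|
    "Example" if data.start_with?("example")
  end
end

rules = SketchHeuristics.heuristics
perl_guess   = rules[0].call("use strict;\n")
prolog_guess = rules[0].call("foo(X) :- bar(X).\n")
no_guess     = rules[0].call("package Foo;\n")
```

Because rules are stored in declaration order and the dispatcher stops at the first rule that applies, the order of `disambiguate` calls matters whenever two rules could apply to the same file.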

new(exts_and_langs, &heuristic)

Internal

# File lib/linguist/heuristics.rb, line 54
def initialize(exts_and_langs, &heuristic)
  @exts_and_langs, @candidates = exts_and_langs.partition {|e| e =~ /\A\./}
  @heuristic = heuristic
end
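
The `partition` call in the constructor splits the mixed argument list into two Arrays: entries that start with a dot (extensions) and everything else (language names). A small sketch with made-up input shows the split; the local names `exts` and `langs` are illustrative only.

```ruby
# Made-up input mixing extensions and language names.
exts_and_langs = [".pm", "Perl", ".t", "Prolog"]

# Entries beginning with "." go to the first Array, the rest to the second.
exts, langs = exts_and_langs.partition { |e| e =~ /\A\./ }
```

Note that after the split in the listing, `@exts_and_langs` holds only the extensions, while `@candidates` holds the language names.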

Public Instance Methods

call(data)

Internal: Perform the heuristic

# File lib/linguist/heuristics.rb, line 71
def call(data)
  @heuristic.call(data)
end

matches?(filename, candidates)

Internal: Check if this heuristic matches the candidate filenames or languages.

# File lib/linguist/heuristics.rb, line 61
def matches?(filename, candidates)
  filename = filename.downcase
  candidates = candidates.compact.map(&:name)
  @exts_and_langs.any? { |ext| filename.end_with?(ext) } ||
    (candidates.any? &&
     (@candidates - candidates == [] &&
      candidates - @candidates == []))
end