minimel.run module

minimel.run.vectorize_text(texts, vectorizer=None, dim=None)
minimel.run.get_scores(golds, preds, per_name=False)
class minimel.run.MiniNED(dawgfile: Path, candidatefile: Path | None = None, modelfile: Path | None = None, vectorizer: Path | None = None, ent_feats_csv: Path | None = None, lang: str | None = None, fallback: Path | None = None)

Bases: object

predict(text: str, name: str, upperbound=None, all_scores=False)

Make a named entity disambiguation (NED) prediction for name in text.

Parameters:
  • text (str) – Text in which the entity name occurs

  • name (str) – An entity name that appears in text

Keyword Arguments:
  • all_scores – Output all candidate scores

  • upperbound – Create upper bound on performance

Returns:

Wikidata ID of the predicted entity
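
As a rough illustration of the call shape above, the sketch below runs predict once per name in a text. The helper link_names and the StubNED class are hypothetical stand-ins so that the example runs without any model files; in practice ned would be a MiniNED instance built from a DAWG file. The IDs and the stub's behavior for unknown names are illustrative assumptions, not documented behavior.

```python
from typing import Dict, Iterable, Optional


class StubNED:
    """Hypothetical stand-in exposing the same predict(text, name) call as MiniNED."""

    def __init__(self, table: Dict[str, str]):
        self.table = table

    def predict(self, text: str, name: str, upperbound=None, all_scores=False) -> Optional[str]:
        # A real model would use `text` as context; this stub only looks up the name.
        # Returning None for unknown names is this stub's choice, not documented behavior.
        return self.table.get(name)


def link_names(ned, text: str, names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Call ned.predict(text, name) for each name and collect the returned IDs."""
    return {name: ned.predict(text, name) for name in names}


ned = StubNED({"Paris": "Q90"})  # illustrative mapping, not real model output
links = link_names(ned, "Paris is the capital of France.", ["Paris", "France"])
print(links)
```

Because link_names only relies on the predict(text, name) call, the stub can be swapped for a real MiniNED object without changing the helper.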

minimel.run.run(dawgfile: Path, candidatefile: Path | None = None, modelfile: Path | None = None, *runfiles: Path, outfile: Path | None = None, vectorizer: Path | None = None, ent_feats_csv: Path | None = None, lang: str | None = None, fallback: Path | None = None, evaluate: bool = False, evalfile: Path | None = None, evalfile_per_name: Path | None = None, predict_only: bool = True, all_scores: bool = False, upperbound: bool = False, split: int | None = None, fold: int | None = None)

Perform entity disambiguation.

Keyword Arguments:
  • outfile – Write outputs to file (default: stdout)

  • vectorizer – scikit-learn vectorizer (.pickle) or fastText word embeddings (.bin). If unset, a HashingVectorizer is used.

  • ent_feats_csv – CSV of entity features, one (ent_id, space-separated feature list) row per entity

  • fallback – Additional JSON file with a deterministic name -> ID mapping, used as a fallback

  • evaluate – Report evaluation scores instead of predictions

  • evalfile – Write evaluation results to file

  • evalfile_per_name – Write evaluation results per name to file

  • predict_only – Only print predictions, not original text

  • all_scores – Output all candidate scores

  • upperbound – Create upper bound on performance

  • split – Split the data into several parts

  • fold – Use only this fold of the split data
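
The two auxiliary input files described above (ent_feats_csv and fallback) can be pictured with a small sketch. It assumes a comma-delimited CSV without a header row and a flat JSON object; neither detail is confirmed here. The helper read_ent_feats, the sample IDs, and the feature names are hypothetical and serve only to show the shape of the data.

```python
import csv
import io
import json

# Illustrative contents only; the feature names are hypothetical.
ENT_FEATS_CSV = "Q90,city capital france\nQ64,city capital germany\n"
FALLBACK_JSON = '{"Paris": "Q90", "Berlin": "Q64"}'


def read_ent_feats(fileobj):
    """Parse (ent_id, space-separated features) rows into {ent_id: [features]}."""
    feats = {}
    for row in csv.reader(fileobj):
        if not row:
            continue  # skip blank lines
        ent_id = row[0]
        feat_str = row[1] if len(row) > 1 else ""
        feats[ent_id] = feat_str.split()
    return feats


# In-memory files stand in for the paths that would be passed to run().
ent_feats = read_ent_feats(io.StringIO(ENT_FEATS_CSV))
fallback = json.loads(FALLBACK_JSON)
print(ent_feats["Q90"])
print(fallback["Berlin"])
```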

minimel.run.evaluate(goldfile: Path, *predfiles: Path, agg: List[Path] = (), evalfile: Path | None = None)

Evaluate predictions against a gold file.

Keyword Arguments:
  • agg – Aggregation JSON files (TODO: depend on data…?)

  • evalfile – Write evaluation results to file
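
The metrics reported by evaluate and get_scores are not specified here. As one common way to compare gold and predicted IDs, the sketch below computes micro-averaged precision, recall and F1, treating a None prediction as "no answer". The function micro_scores is hypothetical and is not part of minimel; the IDs are illustrative.

```python
from typing import Dict, Optional, Sequence


def micro_scores(golds: Sequence[str], preds: Sequence[Optional[str]]) -> Dict[str, float]:
    """Micro precision/recall/F1 where a None prediction means 'no answer'."""
    if len(golds) != len(preds):
        raise ValueError("golds and preds must have the same length")
    # A prediction is correct only if it was made and matches the gold ID.
    correct = sum(1 for g, p in zip(golds, preds) if p is not None and g == p)
    answered = sum(1 for p in preds if p is not None)
    precision = correct / answered if answered else 0.0
    recall = correct / len(golds) if golds else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# One correct answer, one abstention, one wrong answer.
scores = micro_scores(["Q90", "Q64", "Q84"], ["Q90", None, "Q23"])
print(scores)
```

Counting abstentions against recall but not precision is a design choice of this sketch; a scorer that forces an answer for every name would make the two values equal.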