Argument Mining API documentation


The Argument Mining API serves as a direct interface to the ArgumenText project. It can be used to bypass the user interface and access the search engine directly via HTTP POST. Two different APIs are provided: the search API and the classify API.

Request API Access

If you want to have access to the API, please register on this site.

With an API login, permission is granted (free of charge) to use the services provided on this website. No warranties of any kind are given.

Classify API

Classifies given texts based on a search topic.

The classify API works similarly to the search on the main page but takes a list of sentences, a text, or a URL with text as input instead of searching for documents in an index. The classify API can be accessed via https://api.argumentsearch.com/en/classify. Both input and output are in JSON format.
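Since the API is plain JSON over HTTP POST, a classify request can be sketched with Python's standard library alone. This is an illustrative sketch, not official client code; the userID and apiKey values are placeholders, and the payload fields follow the parameter list below.

```python
import json
import urllib.request

API_URL = "https://api.argumentsearch.com/en/classify"

def classify(payload: dict) -> dict:
    """POST a classify request and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# A minimal request body; credentials are placeholders.
payload = {
    "topic": "Nuclear power",
    "sentences": ["Nuclear power is awesome."],
    "userID": "yourPersonalUserID",
    "apiKey": "yourPersonalApiKey",
}
```

Calling `classify(payload)` with valid credentials returns the JSON structure described under Returns below.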

Parameters

Each key in the input JSON is described below.

topic : str
Search query
sortBy : str

Sort by argumentConfidence (Default) or argumentConfidenceLex.

argumentConfidence: Sort by the average of argument and stance (if applicable) confidence.

argumentConfidenceLex: Sort using NumPy's lexsort.

Leave empty if the sentences should not be sorted (keeps the input order).

userID : str
Pass your personal userID.
apiKey : str
Pass your personal apiKey.
model : str
Model to be used. Available options are default (Default), default_topic_relevance, bert_base, bert_base_multiling_de (only for German).
sentences : list of str
Sentences to be classified. Only one of sentences, text, or targetUrl can be used at a time.
text : str
Text to be classified. Only one of sentences, text, or targetUrl can be used at a time.
targetUrl : str
URL of the text to be classified. Only one of sentences, text, or targetUrl can be used at a time.
predictStance : bool, optional
Predict stances of arguments if true (Default: true).
computeAttention : bool, optional
Computes attention weights if true (Default: true). Does not work for BERT-based models.
showOnlyArguments : bool, optional
Shows only argumentative sentences if true; otherwise shows all sentences (Default: true).
removeDuplicates : bool, optional
Removes duplicate sentences if true (Default: true).
filterNonsensicalEntries : bool, optional
Only keeps sentences that have between 3 and 30 tokens and fewer than 4 sequentially repeating words (Default: true).
topicRelevance : str, optional

Filter the sentences based on the given strategy. Available options are match_string, n_gram_overlap and word2vec (Default: None).

match_string: Selects a sentence if the provided topic occurs in the sentence.

n_gram_overlap: Selects a sentence if any of the nouns in the topic occurs in the sentence. If there are no nouns in the topic, stopwords are removed from the topic and the remaining tokens are checked against the sentence. If no tokens remain after removing stopwords (i.e. all words in the topic are stopwords), all sentences are returned.

word2vec: Selects a sentence based on cosine similarity if a certain threshold is exceeded. The default model is used for calculating cosine similarity irrespective of the model used for prediction.

topicRelevanceThreshold : float, optional
Threshold against which the calculated cosine similarity is compared. Should be between 0 and 1 (Default: 0).
normalizeTopicRelevanceThreshold : bool, optional
If true, normalize the cosine similarities of all sentences before applying topicRelevanceThreshold (Default: false).
userMetadata : str, optional
Custom metadata in the form of a string that is returned (unmodified) with the result of the query (Default: "").
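The filtering behaviour described for topicRelevance and the two threshold parameters can be illustrated in plain Python. This is a rough sketch of the described behaviour, not the server's actual implementation; in particular, min-max scaling is only an assumption about how normalizeTopicRelevanceThreshold normalizes the similarities.

```python
def match_string(topic: str, sentence: str) -> bool:
    # match_string strategy: keep the sentence only if the topic appears
    # verbatim (case-insensitive in this sketch).
    return topic.lower() in sentence.lower()

def cosine_filter(similarities: list, threshold: float, normalize: bool) -> list:
    # word2vec strategy: keep the indices whose cosine similarity exceeds
    # topicRelevanceThreshold, optionally rescaling all similarities to
    # [0, 1] first (assumed behaviour of normalizeTopicRelevanceThreshold).
    if normalize and similarities:
        lo, hi = min(similarities), max(similarities)
        span = (hi - lo) or 1.0
        similarities = [(s - lo) / span for s in similarities]
    return [i for i, s in enumerate(similarities) if s > threshold]

kept = [s for s in ["Nuclear power is awesome.", "The weather is nice."]
        if match_string("nuclear power", s)]
```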

Returns

JSON

The output JSON contains the two keys metadata and sentences, which are explained below. sentences is a list of sentences, and the parameters listed below are returned for each sentence. The input parameters are also returned and are not explained again.

metadata

  • modelVersion: Version of the current running model (e.g. 0.1).

  • timeArgumentPrediction: Time needed to predict the arguments in seconds.

  • timeAttentionComputation: Time needed to compute the attention weights for all words in seconds, -1 if computeAttention=false.

  • timePreprocessing: Time needed to preprocess the documents/sentences in seconds.

  • timeStancePrediction: Time needed to predict the stances of all arguments in seconds, -1 if predictStance=false.

  • timeTotal: Time needed to process all data in seconds.

  • totalArguments: Total number of sentences that are arguments.

  • totalContraArguments: Total number of contra arguments that were found in the data.

  • totalProArguments: Total number of pro arguments that were found in the data.

  • totalNonArguments: Total number of sentences that are not arguments.

  • totalClassifiedSentences: Total number of sentences classified.

sentences

  • argumentConfidence: Confidence that the sentence is an argument.

  • argumentLabel: Argument label for the sentence (argument or no argument).

  • sentenceOriginal: Original sentence before preprocessing (e.g. Nuclear power is awesome).

  • sentencePreprocessed: Original sentence after preprocessing (e.g. nuclear power is awesome).

  • sortConfidence: A combined score of argument and stance confidence. If stance is not predicted, it's the same score as argumentConfidence.

  • stanceConfidence: Confidence that the argument is pro/contra with regard to the topic (only if the sentence is an argument and predictStance=true).

  • stanceLabel: Stance label for the argument, pro or contra (only if the sentence is an argument and predictStance=true).

  • weights: Weights that signal the importance of each word of the sentence (only if computeAttention=true, e.g. [0.2, 0.3, 0.4, 0.1]).
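A response shaped like the fields above can be processed with a few lines of Python. The field names follow the documentation, but the response values here are mocked for illustration.

```python
# A mock response shaped like the documented output (values invented).
response = {
    "metadata": {"totalArguments": 2, "totalProArguments": 1,
                 "totalContraArguments": 1, "totalClassifiedSentences": 3},
    "sentences": [
        {"sentenceOriginal": "Nuclear power is awesome.",
         "argumentLabel": "argument", "stanceLabel": "pro",
         "sortConfidence": 0.91},
        {"sentenceOriginal": "Nuclear power is dangerous.",
         "argumentLabel": "argument", "stanceLabel": "contra",
         "sortConfidence": 0.88},
        {"sentenceOriginal": "The plant opened in 1974.",
         "argumentLabel": "no argument", "sortConfidence": 0.12},
    ],
}

# Collect the pro arguments, highest sortConfidence first.
pro = sorted(
    (s for s in response["sentences"] if s.get("stanceLabel") == "pro"),
    key=lambda s: s["sortConfidence"], reverse=True,
)
pro_texts = [s["sentenceOriginal"] for s in pro]
```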

Example 1: Classify API with sentences as input

{
    "topic": "Nuclear power",
    "sentences": [
        "Nuclear power is awesome.",
        "Nuclear power is awesome, because of its nearly zero carbon emissions.",
        "Nuclear power is dangerous, because it produces radioactive waste."
    ],
    "predictStance": true,
    "computeAttention": true,
    "showOnlyArguments": false,
    "userID": "yourPersonalUserID",
    "apiKey": "yourPersonalApiKey"
}

Example 2: Classify API with text as input

{
    "topic": "Nuclear power",
    "text": "Nuclear energy outputs nearly zero carbon emissions. But it is also dangerous, because of the nuclear waste it produces.",
    "predictStance": true,
    "computeAttention": true,
    "showOnlyArguments": false,
    "userID": "yourPersonalUserID",
    "apiKey": "yourPersonalApiKey"
}

Example 3: Classify API with URL as input

{
    "topic": "Brexit",
    "targetUrl": "https://www.washingtonpost.com/world/2018/12/14/is-theresa-may-bad-negotiator-or-is-brexit-just-an-impossible-proposition-answer-yes",
    "predictStance": true,
    "computeAttention": true,
    "showOnlyArguments": false,
    "userID": "yourPersonalUserID",
    "apiKey": "yourPersonalApiKey"
}

Raises

KeyError
If an unknown model is used.

Search API

The search API works similarly to the search on the main page. The search API can be accessed via https://api.argumentsearch.com/en/search. Both input and output are in JSON format.

Parameters

Each key in the input JSON is described below.

topic : str
Search query
index : str
Index server to use (Default: cc).
sortBy : str

Sort by argumentConfidence (Default) or argumentConfidenceLex.

argumentConfidence: Sort by the average of argument and stance (if applicable) confidence.

argumentConfidenceLex: Sort using NumPy's lexsort.

Leave empty if the sentences should not be sorted (keeps the input order).

numDocs : int
Number of documents scanned for arguments. The higher the number, the longer the process may take (Default: 20).
userID : str
Pass your personal userID.
apiKey : str
Pass your personal apiKey.
beginDate : str
Start date from which the documents are searched in the index (in format: yyyy-MM-dd'T'HH:mm:ss).
endDate : str
End date up to which the documents are searched in the index (in format: yyyy-MM-dd'T'HH:mm:ss).
strictTopicSearch : bool
If true, returns only sentences that contain exact matches of the topic.
model : str
Model to be used. Available options are default (Default), default_topic_relevance, bert_base, bert_base_multiling_de (only for German).
predictStance : bool, optional
Predict stances of arguments if true (Default: true).
computeAttention : bool, optional
Computes attention weights if true (Default: true). Does not work for BERT-based models.
showOnlyArguments : bool, optional
Shows only argumentative sentences if true; otherwise shows all sentences (Default: true).
removeDuplicates : bool, optional
Removes duplicate sentences if true (Default: true).
filterNonsensicalEntries : bool, optional
Only keeps sentences that have between 3 and 30 tokens and fewer than 4 sequentially repeating words (Default: true).
topicRelevance : str, optional

Filter the sentences based on the given strategy. Available options are match_string, n_gram_overlap and word2vec (Default: None).

match_string: Selects a sentence if the provided topic occurs in the sentence.

n_gram_overlap: Selects a sentence if any of the nouns in the topic occurs in the sentence. If there are no nouns in the topic, stopwords are removed from the topic and the remaining tokens are checked against the sentence. If no tokens remain after removing stopwords (i.e. all words in the topic are stopwords), all sentences are returned.

word2vec: Selects a sentence based on cosine similarity if a certain threshold is exceeded. The default model is used for calculating cosine similarity irrespective of the model used for prediction.

topicRelevanceThreshold : float, optional
Threshold against which the calculated cosine similarity is compared. Should be between 0 and 1 (Default: 0).
normalizeTopicRelevanceThreshold : bool, optional
If true, normalize the cosine similarities of all sentences before applying topicRelevanceThreshold (Default: false).
userMetadata : str, optional
Custom metadata in the form of a string that is returned (unmodified) with the result of the query (Default: "").
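The beginDate and endDate parameters use the Java-style pattern yyyy-MM-dd'T'HH:mm:ss, which corresponds to "%Y-%m-%dT%H:%M:%S" in Python's strftime notation. A sketch of building a search payload with correctly formatted dates (credentials are placeholders):

```python
from datetime import datetime

# yyyy-MM-dd'T'HH:mm:ss expressed in strftime notation.
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"

search_payload = {
    "topic": "Brexit",
    "numDocs": 20,
    "beginDate": datetime(2018, 1, 1).strftime(DATE_FORMAT),
    "endDate": datetime(2018, 12, 31, 23, 59, 59).strftime(DATE_FORMAT),
    "userID": "yourPersonalUserID",
    "apiKey": "yourPersonalApiKey",
}
```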

Returns

JSON

The output JSON contains the two keys metadata and sentences, which are explained below. sentences is a list of sentences, and the parameters listed below are returned for each sentence. The input parameters are also returned and are not explained again.

metadata

  • language: Language of the model (en or de).

  • modelVersion: Version of the current running model (e.g. 0.1).

  • timeArgumentPrediction: Time needed to predict the arguments in seconds.

  • timeAttentionComputation: Time needed to compute the attention weights for all words in seconds, -1 if computeAttention=false.

  • timePreprocessing: Time needed to preprocess the documents/sentences in seconds.

  • timeIndexing: Time needed to find and return documents from the index in seconds.

  • timeStancePrediction: Time needed to predict the stances of all arguments in seconds, -1 if predictStance=false.

  • timeTotal: Time needed to process all data in seconds.

  • totalArguments: Total number of sentences that are arguments.

  • totalContraArguments: Total number of contra arguments that were found in the data.

  • totalProArguments: Total number of pro arguments that were found in the data.

  • totalNonArguments: Total number of sentences that are not arguments.

  • totalClassifiedSentences: Total number of sentences classified.

sentences

  • argumentConfidence: Confidence that the sentence is an argument.

  • argumentLabel: Argument label for the sentence (argument or no argument).

  • date: Creation date of the document the sentence originates from.

  • sentenceOriginal: Original sentence before preprocessing (e.g. Nuclear power is awesome).

  • sentencePreprocessed: Original sentence after preprocessing (e.g. nuclear power is awesome).

  • sortConfidence: A combined score of argument and stance confidence. If stance is not predicted, it's the same score as argumentConfidence.

  • source: Source of the document the sentence originates from.

  • stanceConfidence: Confidence that the argument is pro/contra with regard to the topic (only if the sentence is an argument and predictStance=true).

  • stanceLabel: Stance label for the argument, pro or contra (only if the sentence is an argument and predictStance=true).

  • url: URL of the document the sentence originates from.

Example

{
    "topic": "Nuclear power",
    "predictStance": true,
    "computeAttention": false,
    "numDocs": 20,
    "sortBy": "argumentConfidence",
    "userID": "yourPersonalUserID",
    "apiKey": "yourPersonalApiKey"
}
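The count fields returned in metadata can be sanity-checked client-side. The relationships below are an assumption drawn from the field descriptions above (pro + contra = total arguments, which presumably holds only when predictStance=true); the metadata values here are mocked.

```python
# Mock metadata shaped like the documented output (values invented).
metadata = {
    "totalProArguments": 12,
    "totalContraArguments": 8,
    "totalArguments": 20,
    "totalNonArguments": 30,
    "totalClassifiedSentences": 50,
}

def counts_consistent(md: dict) -> bool:
    # Assumed invariants: pro + contra arguments add up to the argument
    # total, and arguments + non-arguments to everything classified.
    return (md["totalProArguments"] + md["totalContraArguments"]
            == md["totalArguments"]
            and md["totalArguments"] + md["totalNonArguments"]
            == md["totalClassifiedSentences"])
```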

Raises

KeyError
If an unknown model is used.
Generated by pdoc 0.6.3.