INDEX
Explanations
discussions surrounding objections and arguments
arguments and excuses
Negative Logits
Autoritní         -0.74
GEBURTSDATUM      -0.73
disambiguazione   -0.69
purpoſe           -0.67
queſta            -0.67
Reſ               -0.66
للاسماء           -0.66
GenerationType    -0.65
ModelExpression   -0.63
rungsseite        -0.63
Positive Logits
argument     0.67
arguments    0.60
excuses      0.54
argue        0.51
Argument     0.50
argumentos   0.49
argument     0.49
excuse       0.48
objections   0.47
argued       0.47
Activations Density: 0.162%