INDEX
Explanations
statements about facts and their correctness, often in a socio-political context
Negative Logits
uyu           -0.16
ÏĦε           -0.15
ÅĽ            -0.14
rome          -0.14
ILLED         -0.14
Reputation    -0.14
uffers        -0.14
consort       -0.14
flourish      -0.14
von           -0.14
Positive Logits
importance    0.29
nature        0.23
Importance    0.23
import        0.19
nature        0.18
applic        0.17
role          0.17
elerik        0.16
import        0.16
univers       0.16
Activations Density 0.230%
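
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number is commonly computed, assuming "activation density" means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density is counted per token position, and a feature is
    "active" when its activation is strictly greater than `threshold`.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 2 out of 1000 token positions.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.200%"
```

Under this reading, a density of 0.230% would mean the feature is active on roughly 23 of every 10,000 tokens sampled.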