INDEX
Explanations
phrases that introduce evidence or reasoning
by assumption
Negative Logits
surla  -0.84
iſen  -0.80
kasarigan  -0.79
ſicht  -0.77
ſcher  -0.77
KommentareTeilen  -0.77
Akismet  -0.76
للاسماء  -0.74
informée  -0.74
otomatig  -0.73
Positive Logits
The  0.45
the  0.43
The  0.35
själva  0.32
.  0.31
Since  0.31
gegevens  0.31
wypad  0.30
zamanda  0.30
As  0.30
Activations Density: 0.044%