Explanations
strong emphasis on words within apostrophes
occurrences of the apostrophe
Negative Logits
rador             -0.59
asma              -0.59
manship           -0.58
reconciliation    -0.58
Pru               -0.57
answ              -0.57
fundamentals      -0.57
asing             -0.56
İĭ                -0.55
bonds             -0.55
Positive Logits
Cause             0.97
Allah             0.89
Mech              0.78
Em                0.77
S                 0.77
Angelo            0.72
ead               0.72
Brien             0.72
thur              0.71
uese              0.71
Activations Density 0.046%