INDEX
Explanations
the specific word "irt" with varying degrees of emphasis
instances of the word "virtue" in various forms
Negative Logits
ILCS         -0.68
reckoning    -0.67
interrupted  -0.67
ŃĶ           -0.62
cffffcc      -0.62
¬¼           -0.62
ULT          -0.61
©¶æ          -0.59
sych         -0.59
living       -0.59
Positive Logits
ually        0.96
entials      0.96
alia         0.90
ilde         0.89
inyl         0.86
ables        0.82
ieth         0.82
ullah        0.80
igue         0.79
ilda         0.78
Activations Density 0.025%