Explanations
phrases related to policies and ideals
Negative Logits
Token        Logit
amaz         -0.73
kernel       -0.66
otos         -0.61
ModLoader    -0.61
termin       -0.61
mattress     -0.60
izont        -0.60
orman        -0.59
Spoiler      -0.59
toggle       -0.58
Positive Logits
Token          Logit
alike          1.59
thereof        1.11
respectively   1.03
relating       0.91
belonging      0.86
therein        0.85
pertaining     0.84
gal            0.80
hips           0.80
depending      0.79
Activations Density 0.258%