INDEX
Explanations
references to locations or origins within texts
Negative Logits
متعلقه	-0.96
myſelf	-0.92
LookAnd	-0.89
pleaſure	-0.88
IntoConstraints	-0.88
ſtate	-0.86
">//	-0.85
فريبيس	-0.85
himſelf	-0.83
ImageContext	-0.83
Positive Logits
a	0.90
very	0.80
an	0.72
only	0.68
quite	0.68
the	0.64
keinem	0.64
neither	0.63
יקר	0.61
relatively	0.58
Activations Density 0.518%