INDEX
Explanations
references to the word "kind" and its variations
Negative Logits
Token         Logit
ional         -0.17
ĤŃ            -0.16
aches         -0.15
gov           -0.15
agon          -0.15
ors           -0.15
ponent        -0.14
ãģ£ãģ¡        -0.14
EF            -0.14
ampie         -0.14
Positive Logits
Token         Logit
ergarten      0.33
red           0.32
reds          0.25
led           0.23
gom           0.23
ling          0.23
erg           0.22
ness          0.19
RED           0.19
-hearted      0.18
Activations Density: 0.034%