INDEX
Explanations
mathematical comparisons and relationships, especially involving values and operations
after conjunctions or punctuation
Negative Logits
nemlig           -0.63
msgTypes         -0.60
SequentialGroup  -0.59
enää             -0.59
geliyor          -0.58
prøve            -0.55
оригіналу        -0.55
enumi            -0.54
varandra         -0.54
<bos>            -0.54
Positive Logits
water      0.87
trees      0.84
food       0.79
eating     0.78
shoes      0.73
dogs       0.72
cars       0.72
foods      0.71
airplanes  0.70
music      0.70
Activations Density 0.654%