Explanations
the concept of quantity, particularly the term "more" in various contexts
Negative Logits
Token        Logit
lance        -0.83
abad         -0.77
Rus          -0.75
vier         -0.75
ãĥĥãĥĪ       -0.72
staking      -0.72
hell         -0.70
velt         -0.70
Joy          -0.66
odor         -0.66
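The token ãĥĥãĥĪ in the negative list looks like encoding damage, but it is more likely the raw vocabulary string of a byte-level BPE tokenizer. The sketch below assumes a GPT-2-style byte-to-unicode table; the page does not name the tokenizer, so this is an assumption. The helper names `bytes_to_unicode` and `decode_token` are chosen here for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable-character table, assuming the GPT-2 convention."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(token: str) -> str:
    """Map a byte-level vocabulary string back to readable UTF-8 text."""
    char_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥĥãĥĪ"))
```

Under that assumption the token decodes to the katakana pair ット, so it is a legitimate vocabulary entry and not extraction debris.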
Positive Logits
Token        Logit
layers       0.99
consecutive  0.95
instances    0.94
hundred      0.88
phases       0.87
iterations   0.87
occasions    0.86
branches     0.86
columns      0.86
copies       0.86
Activations Density: 0.022%
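An activation density figure like this is usually read as the fraction of sampled tokens on which the feature fires. The page does not define it, so the sketch below assumes that reading; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumes density means "share of tokens where the feature is active";
    the source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 22 of 100,000 tokens.
acts = [0.0] * 99_978 + [1.5] * 22
print(f"{activation_density(acts):.3%}")
```

At 0.022%, the feature would be active on roughly 22 of every 100,000 tokens, which makes it quite sparse.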