INDEX
Explanations
specific patterns followed by 'V'
occurrences of the letter "V"
Negative Logits
Token            Logit
understatement   -0.70
scient           -0.63
barley           -0.62
bere             -0.61
istani           -0.60
summons          -0.59
upfront          -0.59
doomed           -0.58
dispers          -0.58
pumpkin          -0.57
Positive Logits
Token            Logit
V                3.52
Vs               2.16
v                2.05
VT               1.98
VA               1.97
VI               1.93
VL               1.89
VB               1.87
V                1.86
VD               1.85
Activations Density: 0.019%