INDEX
Explanations
the article "a" indicating the presence of singular nouns
Head Attr Weights
0:0.07
1:0.05
2:0.09
3:0.08
4:0.08
5:0.08
6:0.08
7:0.10
8:0.08
9:0.07
10:0.07
11:0.09
Negative Logits
版            -1.73
dinand        -1.72
opol          -1.70
ottesville    -1.66
raviolet      -1.64
ricanes       -1.64
perse         -1.58
)'            -1.57
mainland      -1.56
ertodd        -1.54
Positive Logits
Aid               1.91
understatement    1.78
esteem            1.66
encour            1.57
AE                1.54
WARN              1.53
feeling           1.52
morale            1.52
intolerance       1.50
indifference      1.50
Activations Density: 0.000%