INDEX
Explanations
references to specific publications or archives
Head Attr Weights
0:0.33
1:0.02
2:0.01
3:0.07
4:0.09
5:0.04
6:0.03
7:0.01
8:0.28
9:0.05
10:0.01
11:0.01
Negative Logits
handshake    -1.99
passwords    -1.69
hugs         -1.63
pledges      -1.60
selfies      -1.59
istor        -1.59
answers      -1.58
behaviours   -1.58
resumes      -1.55
scaling      -1.53
Positive Logits
Canyon       1.57
Rim          1.57
ribune       1.54
Zone         1.53
Rated        1.49
Alley        1.48
Ble          1.47
idates       1.45
rica         1.42
Slam         1.40
Activations Density 0.000%