Negative Logits
Token         Value
s             0.97
h             0.74
}             0.70
v             0.69
com           0.65
ags           0.64
k             0.63
curr          0.62
embodies      0.62
bu            0.61
Positive Logits
Token          Value
라             0.93
thei           0.75
기             0.72
chronological  0.68
어             0.66
etically       0.66
他             0.66
ม              0.65
नहर            0.63
두             0.63
Activations Density 0.002%