Explanations
Japanese and Korean adjectives
Negative Logits
Token    Logit
existem    0.87
berper    0.85
trái    0.85
têm    0.83
않는다    0.82
ள்ளது    0.81
는다    0.81
한다    0.81
있습니다    0.79
৯    0.79
Positive Logits
Token    Logit
ness    1.06
くて    1.06
く    1.00
eness    0.96
さを    0.95
set    0.95
な    0.94
하게    0.92
Telemetry    0.91
한    0.89
Activations Density 0.063%