INDEX
Explanations
words related to legal terminology and proceedings
Head Attr Weights
0:0.05
1:0.08
2:0.15
3:0.03
4:0.02
5:0.04
6:0.05
7:0.03
8:0.02
9:0.03
10:0.41
11:0.05
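A minimal sketch of how the head attribution weights above could be summarized, assuming each entry is a `head index:weight` pair. The variable names are illustrative and not part of the original dashboard. It finds the head with the largest weight and that head's share of the listed total.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.05, 1: 0.08, 2: 0.15, 3: 0.03, 4: 0.02, 5: 0.04,
    6: 0.05, 7: 0.03, 8: 0.02, 9: 0.03, 10: 0.41, 11: 0.05,
}

# Head with the largest attribution weight.
top_head = max(head_weights, key=head_weights.get)

# Sum of the listed weights, and the top head's share of that sum.
total = sum(head_weights.values())
top_share = head_weights[top_head] / total

print(f"top head: {top_head}, weight: {head_weights[top_head]:.2f}")
print(f"sum of listed weights: {total:.2f}, top head share: {top_share:.1%}")
```

The listed weights do not sum exactly to 1, which is assumed here to be a rounding effect of the two-decimal display.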
Negative Logits
Felix     -2.83
Fel       -2.63
Mutual    -2.55
fitt      -2.40
slic      -2.40
Fut       -2.39
Finnish   -2.37
Fen       -2.34
FW        -2.31
Fel       -2.28
Positive Logits
ram       4.93
RAM       4.25
rams      4.13
Ram       3.91
Ram       3.45
ram       3.25
rame      3.16
rama      3.13
rak       3.10
Rah       3.05
Activations Density 0.001%