KaraKaraWitch/StackedBlenderCartel-llama33-120B

This is a merge of pre-trained language models created using mergekitty.

But... WHY?!

... A stupid idea crept into my mind when I saw that I had two 70B models of pretty good™ quality.

You could make a stacked model out of this.

So... here we are.

I know, people have told us countless times not to do this, yet here we are once again with a stupid idea.

Merge Details

Merge Method

This model was merged using the passthrough merge method.
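For reference, passthrough performs no weight averaging: each slice's layers are copied verbatim from their donor model, and the slices are concatenated into a single, deeper stack. A minimal illustrative sketch of the idea, using hypothetical names rather than mergekitty's actual code:

# Illustrative sketch of a passthrough stack; not the mergekitty implementation.
def passthrough_stack(slices):
    # slices: list of (donor_layers, start, end), with half-open layer ranges.
    stacked = []
    for donor_layers, start, end in slices:
        # Layers are reused as-is; nothing is interpolated between the two donors.
        stacked.extend(donor_layers[start:end])
    return stacked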

Models Merged

The following models were included in the merge:

KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
KaraKaraWitch/BlenderCartel-llama33-70B-Pt2

Configuration

The following YAML configuration was used to produce this model:


slices:
  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
      layer_range: [0, 16]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt2
      layer_range: [8, 24]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
      layer_range: [17, 32]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt2
      layer_range: [25, 40]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
      layer_range: [33, 48]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt2
      layer_range: [41, 56]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
      layer_range: [49, 64]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt2
      layer_range: [57, 72]

  - sources:
    - model: KaraKaraWitch/BlenderCartel-llama33-70B-Pt1
      layer_range: [65, 80]

merge_method: passthrough
dtype: float16
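
As a sanity check on the resulting depth, the slice ranges above can be summed directly. This assumes mergekit-style half-open ranges (layer_range: [a, b] covers layers a through b-1) and uses the public Llama 3.3 70B dimensions for a rough, approximate parameter estimate:

# Depth and rough size implied by the config above (assuming half-open layer ranges).
slices = [
    ("Pt1", 0, 16), ("Pt2", 8, 24), ("Pt1", 17, 32),
    ("Pt2", 25, 40), ("Pt1", 33, 48), ("Pt2", 41, 56),
    ("Pt1", 49, 64), ("Pt2", 57, 72), ("Pt1", 65, 80),
]
total_layers = sum(end - start for _, start, end in slices)
print(total_layers)  # 137 layers, versus 80 in each 70B donor

# Rough estimate with assumed Llama 3.3 70B dims (hidden 8192, MLP 28672, 1024-dim KV, 128256 vocab):
per_layer = 2 * 8192**2 + 2 * 8192 * 1024 + 3 * 8192 * 28672  # attention (q, o, k, v) + MLP (gate, up, down)
embeddings = 2 * 128256 * 8192                                # input embeddings + untied LM head
print(f"~{(total_layers * per_layer + embeddings) / 1e9:.0f}B params")  # roughly 119B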
