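Vision learner

All the functions necessary to build a Learner suitable for transfer learning in computer vision

The most important functions of this module are vision_learner and unet_learner. They help you define a Learner using a pretrained model. See the vision tutorial for usage examples.

Cut a pretrained model

By default, the fastai library cuts a pretrained model at the pooling layer. This function helps detect that layer.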

has_pool_type

 has_pool_type (m)

Return True if m is a pooling layer or has one among its children

m = nn.Sequential(nn.AdaptiveAvgPool2d(5), nn.Linear(2,3), nn.Conv2d(2,3,1), nn.MaxPool3d(5))
assert has_pool_type(m)
test_eq([has_pool_type(m_) for m_ in m.children()], [True,False,False,True])
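
The check can be pictured as a recursive walk over the module tree. The sketch below is an illustration, not the library's source: the name has_pool_type_sketch is hypothetical, and treating any module whose class name ends in Pool1d, Pool2d or Pool3d as a pooling layer is an assumption.

```python
import re
import torch.nn as nn

def has_pool_type_sketch(m: nn.Module) -> bool:
    "Hypothetical helper: True if `m` or any of its descendants looks like a pooling layer."
    # Assumption: a pooling layer is any module whose class name ends in Pool1d/2d/3d.
    if re.search(r"Pool[123]d$", type(m).__name__):
        return True
    # Otherwise recurse into the direct children.
    return any(has_pool_type_sketch(child) for child in m.children())

m = nn.Sequential(nn.AdaptiveAvgPool2d(5), nn.Linear(2, 3), nn.Conv2d(2, 3, 1), nn.MaxPool3d(5))
flags = [has_pool_type_sketch(child) for child in m.children()]

# A pooling layer buried one level deeper is still found.
nested = nn.Sequential(nn.Sequential(nn.ReLU(), nn.AvgPool2d(2)))
```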

cut_model

 cut_model (model, cut)

Cut an instantiated model

create_body

 create_body (model, n_in=3, pretrained=True, cut=None)

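Cut off the body of a typically pretrained arch as determined by cut

cut can be either an integer, in which case the model is cut at the corresponding layer, or a function, in which case the result is cut(model). If it is not specified, it defaults to the first layer that contains some pooling.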

def tst(): return nn.Sequential(nn.Conv2d(3,5,3), nn.BatchNorm2d(5), nn.AvgPool2d(1), nn.Linear(3,4))
m = create_body(tst())
test_eq(len(m), 2)

m = create_body(tst(), cut=3)
test_eq(len(m), 3)

m = create_body(tst(), cut=noop)
test_eq(len(m), 4)

for n in range(1,5):    
    m = create_body(tst(), n_in=n)
    test_eq(_get_first_layer(m)[0].in_channels, n)
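
The integer-or-function rule for cut can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions: cut_model_sketch and make_model are hypothetical names, and the default case (cut=None) is deliberately left out.

```python
import torch.nn as nn

def cut_model_sketch(model: nn.Module, cut):
    "Hypothetical helper: an int keeps the first `cut` children, a callable returns cut(model)."
    if isinstance(cut, int):
        return nn.Sequential(*list(model.children())[:cut])
    if callable(cut):
        return cut(model)
    raise TypeError("cut must be an integer or a function")

def make_model():
    # Same toy architecture as in the tests above.
    return nn.Sequential(nn.Conv2d(3, 5, 3), nn.BatchNorm2d(5), nn.AvgPool2d(1), nn.Linear(3, 4))

first_three = cut_model_sketch(make_model(), 3)          # integer cut
unchanged = cut_model_sketch(make_model(), lambda m: m)  # function cut that keeps everything
```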

Head and model

create_head

 create_head (nf, n_out, lin_ftrs=None, ps=0.5, pool=True,
              concat_pool=True, first_bn=True, bn_final=False,
              lin_first=False, y_range=None)

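Model head that takes nf features, runs them through lin_ftrs, and outputs n_out classes.

If concat_pool=True, the head begins with fastai's AdaptiveConcatPool2d; otherwise it uses traditional average pooling. A Flatten layer follows, then blocks of BatchNorm, Dropout and Linear layers (if lin_first=True, the order is Linear, BatchNorm, Dropout).

The blocks start at nf, go through every element of lin_ftrs (defaults to [512]), and end at n_out. ps is a list of probabilities used for the dropouts (if you pass only one value, half of it is used first, then the value itself as many times as necessary).

If first_bn=True, a BatchNorm layer is added right after the pooling operation. If bn_final=True, a final BatchNorm layer is added. If y_range is passed, the function adds a SigmoidRange to that range.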
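
Two of the building blocks named above can be sketched in plain PyTorch. Both are hypothetical stand-ins rather than fastai's own classes: ConcatPoolSketch assumes that concat pooling means concatenating adaptive average and max pooling along the channel axis (the order is an assumption), and sigmoid_range_sketch assumes that SigmoidRange is a sigmoid rescaled to (low, high).

```python
import torch
import torch.nn as nn

class ConcatPoolSketch(nn.Module):
    "Hypothetical stand-in for concat pooling: average- and max-pool, then concatenate channels."
    def __init__(self, size: int = 1):
        super().__init__()
        self.ap = nn.AdaptiveAvgPool2d(size)
        self.mp = nn.AdaptiveMaxPool2d(size)

    def forward(self, x):
        # The channel count doubles: nf features in, 2*nf features out.
        # Assumption: the concatenation order is illustrative only.
        return torch.cat([self.mp(x), self.ap(x)], dim=1)

def sigmoid_range_sketch(x, low, high):
    "Squash x into the interval (low, high) with a scaled and shifted sigmoid."
    return torch.sigmoid(x) * (high - low) + low

x = torch.randn(2, 5, 7, 7)
pooled = ConcatPoolSketch()(x)
scaled = sigmoid_range_sketch(torch.tensor([-100.0, 0.0, 100.0]), 0.0, 5.0)
```

Under these assumptions, a head built for nf=5 sees 10 features after pooling.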

tst = create_head(5, 10)
tst

Sequential(
  (0): AdaptiveConcatPool2d(
    (ap): AdaptiveAvgPool2d(output_size=1)
    (mp): AdaptiveMaxPool2d(output_size=1)
  )
  (1): fastai.layers.Flatten(full=False)
  (2): BatchNorm1d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (3): Dropout(p=0.25, inplace=False)
  (4): Linear(in_features=10, out_features=512, bias=False)
  (5): ReLU(inplace=True)
  (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (7): Dropout(p=0.5, inplace=False)
  (8): Linear(in_features=512, out_features=10, bias=False)
)
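
The Linear sizes in the printed head follow from the rule described above. The helper below is a hypothetical sketch, not part of fastai; it assumes that concat pooling doubles nf, which is consistent with the BatchNorm1d(10) shown for nf=5.

```python
def head_block_sizes(nf, n_out, lin_ftrs=None, concat_pool=True):
    "Hypothetical helper: (in_features, out_features) of each Linear layer in the head."
    if concat_pool:
        nf *= 2  # assumption: concat pooling doubles the number of features
    hidden = lin_ftrs if lin_ftrs is not None else [512]  # documented default
    sizes = [nf] + hidden + [n_out]
    # Pair each size with the next one to get the Linear layer shapes.
    return list(zip(sizes[:-1], sizes[1:]))

default_blocks = head_block_sizes(5, 10)
custom_blocks = head_block_sizes(5, 10, lin_ftrs=[256, 64], concat_pool=False)
```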

#TODO: refactor, i.e. something like this?
# class ModelSplitter():
#     def __init__(self, idx): self.idx = idx
#     def split(self, m): return L(m[:self.idx], m[self.idx:]).map(params)
#     def __call__(self,): return {'cut':self.idx, 'split':self.split}

default_split

 default_split (m)

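Default split of a model between body and head

To do transfer learning, you need to pass a splitter to Learner. This should be a function that takes the model and returns a collection of parameter groups, e.g. a list of lists of parameters.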
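
As a minimal sketch of what such a function might look like, the hypothetical body_head_splitter below assumes the model is an nn.Sequential whose first child is the body and whose remaining children form the head. It uses plain PyTorch only and is not the library's default_split.

```python
import torch.nn as nn

def body_head_splitter(model: nn.Sequential):
    "Hypothetical splitter: group 0 holds the body's parameters, group 1 the head's."
    body, head = model[0], model[1:]
    return [list(body.parameters()), list(head.parameters())]

body = nn.Sequential(nn.Conv2d(3, 4, 3), nn.BatchNorm2d(4))
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(4, 2))
model = nn.Sequential(body, head)

groups = body_head_splitter(model)
```

Keeping the body and the head in separate groups is what lets the body be frozen, or trained with a different learning rate, independently of the head.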

add_head

 add_head (body, nf, n_out, init=<function kaiming_normal_>, head=None,
           concat_pool=True, pool=True, lin_ftrs=None, ps=0.5,
           first_bn=True, bn_final=False, lin_first=False, y_range=None)

Add a head to a vision body

create_vision_model

 create_vision_model (arch, n_out, pretrained=True, weights=None,
                      cut=None, n_in=3, init=<function kaiming_normal_>,
                      custom_head=None, concat_pool=True, pool=True,
                      lin_ftrs=None, ps=0.5, first_bn=True,
                      bn_final=False, lin_first=False, y_range=None)

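Create custom vision architecture

The model is cut according to cut and may be pretrained, in which case the proper set of weights is downloaded and then loaded. init is applied to the head of the model, which is either created by create_head (with lin_ftrs, ps, concat_pool, bn_final, lin_first and y_range) or given by custom_head.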

tst = create_vision_model(models.resnet18, 10, True)
tst = create_vision_model(models.resnet18, 10, True, n_in=1)
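
The body-plus-head assembly can be illustrated with a toy model in plain PyTorch. This is a sketch under stated assumptions, not the library's implementation: add_head_sketch is a hypothetical name, the body is a small stand-in for a cut pretrained network, and init is applied to the Linear weights of the head only.

```python
import torch
import torch.nn as nn

def add_head_sketch(body, head, init=nn.init.kaiming_normal_):
    "Hypothetical helper: stack body and head, initializing only the head's Linear weights."
    for layer in head.modules():
        if isinstance(layer, nn.Linear):
            init(layer.weight)
    return nn.Sequential(body, head)

# Toy stand-in for a cut, pretrained body with 8 output channels.
body = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
# Minimal head: pool, flatten, then map the 8 features to 10 classes.
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

model = add_head_sketch(body, head)
out = model(torch.randn(2, 3, 16, 16))
```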

TimmBody

 TimmBody (model, pretrained:bool=True, cut=None, n_in:int=3)

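*Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes: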

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

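Submodules assigned in this way will be registered, and will also have their parameters converted when you call :meth:to, etc.

.. note:: As per the example above, an __init__() call to the parent class must be made before assignment on the child.

:ivar training: Boolean representing whether this module is in training or evaluation mode. :vartype training: bool*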

create_timm_model

 create_timm_model (arch, n_out, cut=None, pretrained=True, n_in=3,
                    init=<function kaiming_normal_>, custom_head=None,
                    concat_pool=True, pool=True, lin_ftrs=None, ps=0.5,
                    first_bn=True, bn_final=False, lin_first=False,
                    y_range=None, **kwargs)

Create custom architecture using arch, n_in and n_out from the timm library

# make sure that timm models can be scripted:
tst, _ = create_timm_model('resnet34', 1)
scripted = torch.jit.script(tst)
assert scripted, "model could not be converted to TorchScript"

`Learner` convenience functions

vision_learner

 vision_learner (dls, arch, normalize=True, n_out=None, pretrained=True,
                 weights=None, loss_func=None, opt_func=<function Adam>,
                 lr=0.001, splitter=None, cbs=None, metrics=None,
                 path=None, model_dir='models', wd=None, wd_bn_bias=False,
                 train_bn=True, moms=(0.95, 0.85, 0.95), cut=None,
                 init=<function kaiming_normal_>, custom_head=None,
                 concat_pool=True, pool=True, lin_ftrs=None, ps=0.5,
                 first_bn=True, bn_final=False, lin_first=False,
                 y_range=None, n_in=3)

Build a vision learner from dls and arch

	Type	Default	Details
dls
arch
normalize	bool	True
n_out	NoneType	None
pretrained	bool	True
weights	NoneType	None
loss_func	NoneType	None
opt_func	function	Adam
lr	float	0.001
splitter	NoneType	None
cbs	NoneType	None
metrics	NoneType	None
path	NoneType	None	learner args
model_dir	str	models
wd	NoneType	None
wd_bn_bias	bool	False
train_bn	bool	True
moms	tuple	(0.95, 0.85, 0.95)
cut	NoneType	None
init	function	kaiming_normal_
custom_head	NoneType	None
concat_pool	bool	True
pool	bool	True	model & head args
lin_ftrs	NoneType	None
ps	float	0.5
first_bn	bool	True
bn_final	bool	False
lin_first	bool	False
y_range	NoneType	None
n_in	int	3

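The model is built from arch, with the number of final activations inferred from dls if possible (otherwise pass a value to n_out). It may be pretrained, and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a cut or a splitter).

If normalize and pretrained are both True, this function adds a Normalization transform to the dls (if there is not already one) using the statistics of the pretrained model. That way, you won't ever forget to normalize your data in transfer learning.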
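
What that normalization amounts to can be sketched in plain PyTorch. The example assumes the pretrained model was trained with the widely used ImageNet channel statistics; normalize_batch is a hypothetical helper, not the transform fastai adds.

```python
import torch

# Widely used ImageNet channel statistics (RGB order).
imagenet_mean = torch.tensor([0.485, 0.456, 0.406])
imagenet_std = torch.tensor([0.229, 0.224, 0.225])

def normalize_batch(x, mean=imagenet_mean, std=imagenet_std):
    "Normalize a batch shaped (batch, channels, height, width) channel by channel."
    # Reshape the statistics to (1, C, 1, 1) so they broadcast over batch and pixels.
    return (x - mean[None, :, None, None]) / std[None, :, None, None]

# A batch in which every pixel equals the channel mean normalizes to zero.
x = imagenet_mean[None, :, None, None].expand(4, 3, 8, 8)
normalized = normalize_batch(x)
```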

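All other arguments are passed to Learner.

Starting with version 0.13, TorchVision supports multiple pretrained weights for the same model architecture. The vision_learner default of pretrained=True, weights=None uses the architecture's default weights, which for an architecture such as ResNet-50 are currently IMAGENET1K_V2. If you are using an older version of TorchVision or creating a timm model, setting weights has no effect.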

from torchvision.models import ResNet50_Weights

# `dls` is assumed to be an existing DataLoaders object; other arguments are omitted.

# Legacy weights with accuracy 76.130%
vision_learner(dls, models.resnet50, pretrained=True, weights=ResNet50_Weights.IMAGENET1K_V1)

# New weights with accuracy 80.858%. Strings are also supported.
vision_learner(dls, models.resnet50, pretrained=True, weights='IMAGENET1K_V2')

# Best available weights (currently an alias for IMAGENET1K_V2).
# Default weights if vision_learner weights isn't set.
vision_learner(dls, models.resnet50, pretrained=True, weights=ResNet50_Weights.DEFAULT)

# No weights - random initialization
vision_learner(dls, models.resnet50, pretrained=False, weights=None)

The example above shows how to use the new TorchVision 0.13 multi-weight API with vision_learner.

path = untar_data(URLs.PETS)
fnames = get_image_files(path/"images")
pat = r'^(.*)_\d+.jpg$'
dls = ImageDataLoaders.from_name_re(path, fnames, pat, item_tfms=Resize(224))

learn = vision_learner(dls, models.resnet18, loss_func=CrossEntropyLossFlat(), ps=0.25)

If you pass a str to arch, a timm model is created:

dls = ImageDataLoaders.from_name_re(path, fnames, pat, item_tfms=Resize(224))
learn = vision_learner(dls, 'convnext_tiny', loss_func=CrossEntropyLossFlat(), ps=0.25)

create_unet_model

 create_unet_model (arch, n_out, img_size, pretrained=True, weights=None,
                    cut=None, n_in=3, blur=False, blur_final=True,
                    self_attention=False, y_range=None, last_cross=True,
                    bottle=False, act_cls=<class
                    'torch.nn.modules.activation.ReLU'>, init=<function
                    kaiming_normal_>, norm_type=None)

Create custom unet architecture

tst = create_unet_model(models.resnet18, 10, (24,24), True, n_in=1)

unet_learner

 unet_learner (dls, arch, normalize=True, n_out=None, pretrained=True,
               weights=None, config=None, loss_func=None,
               opt_func=<function Adam>, lr=0.001, splitter=None,
               cbs=None, metrics=None, path=None, model_dir='models',
               wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85,
               0.95), cut=None, n_in=3, blur=False, blur_final=True,
               self_attention=False, y_range=None, last_cross=True,
               bottle=False, act_cls=<class
               'torch.nn.modules.activation.ReLU'>, init=<function
               kaiming_normal_>, norm_type=None)

Build a unet learner from dls and arch

	Type	Default	Details
dls
arch
normalize	bool	True
n_out	NoneType	None
pretrained	bool	True
weights	NoneType	None
config	NoneType	None
loss_func	NoneType	None
opt_func	function	Adam
lr	float	0.001
splitter	NoneType	None
cbs	NoneType	None
metrics	NoneType	None
path	NoneType	None	learner args
model_dir	str	models
wd	NoneType	None
wd_bn_bias	bool	False
train_bn	bool	True
moms	tuple	(0.95, 0.85, 0.95)
cut	NoneType	None
n_in	int	3
blur	bool	False
blur_final	bool	True
self_attention	bool	False
y_range	NoneType	None
last_cross	bool	True
bottle	bool	False
act_cls	type	ReLU
init	function	kaiming_normal_
norm_type	NoneType	None

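The model is built from arch, with the number of final filters inferred from dls if possible (otherwise pass a value to n_out). It may be pretrained, and the architecture is cut and split using the default metadata of the model architecture (this can be customized by passing a cut or a splitter).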