Public Data

慧星云 has collected a number of commonly used public datasets so that you can use them directly in your instances.

Usage

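1. Logging in
Log in to the workspace with the ssh command, for example:
ssh -p 14917 root@hz1.dc.houdeyun.cn
2. Location of the public data
The public disk is mounted at /root/public/. You can run the df command to check that the directory is mounted correctly.
3. Using the data
Users have read-only access to the public disk, so extract compressed files to a local directory of your own before using them. For example: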
unzip /root/public/ModelNet/ModelNet10.zip -d /root/ModelNet10
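
The steps above can be wrapped in two small helper functions. This is a minimal sketch, not an official script: the function names and the PUBLIC_ROOT variable are hypothetical, and it assumes that df, awk, and unzip are available in the instance.

```shell
#!/bin/sh
# Sketch only: helpers for working with the read-only public disk.
# Function names and PUBLIC_ROOT are hypothetical, chosen for illustration.

PUBLIC_ROOT="${PUBLIC_ROOT:-/root/public}"   # mount point named in this document

# Print the mount point of the filesystem that holds the given directory,
# as reported by df (-P forces the single-line POSIX output format).
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Extract a zip archive ($1) into a writable local directory ($2),
# creating the directory if needed. Fails if the archive does not exist.
extract_to_local() {
    etl_archive="$1"
    etl_dest="$2"
    if [ ! -f "$etl_archive" ]; then
        echo "archive not found: $etl_archive" >&2
        return 1
    fi
    mkdir -p "$etl_dest" && unzip -q "$etl_archive" -d "$etl_dest"
}

# Example (paths taken from the command above):
#   mount_point_of "$PUBLIC_ROOT"
#   extract_to_local "$PUBLIC_ROOT/ModelNet/ModelNet10.zip" /root/ModelNet10
```

If mount_point_of prints / rather than the public directory, the public disk is probably not mounted as a separate filesystem and is worth checking by hand with df.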

Public Data

| Name | Path in instance | Size | Type | Publisher | Description |
| --- | --- | --- | --- | --- | --- |
| argoverse2.0 sensor dataset | /root/public/datasets/argoverse2.0-sensor | 739.02 GiB | Dataset | https://argoverse.github.io | https://argoverse.github.io/user-guide/ |
| Vimeo-90k | /root/public/datasets/Vimeo-90k | 81.89 GiB | Dataset | toflow.csail.mit.edu | Vimeo-90k video super-resolution dataset |
| CULane | /root/public/datasets/CULane | 42.45 GiB | Dataset | https://xingangpan.github.io/projects/CULane.html | CULane is a large scale challenging dataset for academic research on traffic lane detection |
| TT100K | /root/public/datasets/TT100K | 106.77 GiB | Dataset | https://cg.cs.tsinghua.edu.cn/traffic-sign/ | Traffic sign detection and recognition dataset |
| cifar-100 | /root/public/datasets/cifar-100 | 161.17 MiB | Dataset | https://www.cs.toronto.edu/~kriz/cifar.html | CIFAR-100 image classification dataset |
| CUB200-2011 | /root/public/datasets/CUB200-2011 | 1.11 GiB | Dataset | http://www.vision.caltech.edu/datasets/cub_200_2011/ | Fine-grained bird classification dataset |
| ModelNet | /root/public/datasets/ModelNet | 2.34 GiB | Dataset | https://modelnet.cs.princeton.edu/ | The goal of the Princeton ModelNet project is to provide researchers in computer vision, computer graphics, robotics and cognitive science with a comprehensive clean collection of 3D CAD models for objects. |
| S3DIS | /root/public/datasets/S3DIS | 14.26 GiB | Dataset | http://buildingparser.stanford.edu/dataset.html | Stanford Large-Scale 3D Indoor Spaces Dataset (S3DIS) |
| Aishell | /root/public/datasets/Aishell | 14.51 GiB | Dataset | http://openslr.org/33/ | 400 people from different accent areas in China were invited to participate in the recording, which was conducted in a quiet indoor environment using a high-fidelity microphone and downsampled to 16 kHz. |
| CrowdHuman | /root/public/datasets/CrowdHuman | 13.25 GiB | Dataset | https://www.crowdhuman.org/ | The CrowdHuman dataset is large, rich-annotated and contains high diversity. CrowdHuman contains 15000, 4370 and 5000 images for training, validation, and testing, respectively. |
| MsCelebV1 | /root/public/datasets/MS-Celeb-1M | 154.4 GiB | Dataset | http://research.microsoft.com/en-US/projects/irc/acmmm2016.aspx | Microsoft celebrity dataset |
| DIV2K | /root/public/datasets/DIV2K | 8.45 GiB | Dataset | https://data.vision.ee.ethz.ch/cvl/DIV2K/ | DIVerse 2K resolution high quality images as used for the challenges @ NTIRE (CVPR 2017 and CVPR 2018) and @ PIRM (ECCV 2018) |
| nuScenes | /root/public/datasets/nuScenes | 548.35 GiB | Dataset | https://www.nuscenes.org/ | nuScenes is a public large-scale dataset for autonomous driving. It enables researchers to study challenging urban driving situations using the full sensor suite of a real self-driving car. |
| CelebA | /root/public/datasets/CelebA | 21.69 GiB | Dataset | http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html | CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. |
| KITTI | /root/public/datasets/KITTI | 132.99 GiB | Dataset | https://www.cvlibs.net/datasets/kitti | KITTI dataset |
| KITTI_Depth_Completion | /root/public/datasets/KITTI/kitti_depth_completion | 92.7 GiB | Dataset | http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_completion | KITTI depth completion dataset |
| SemanticKITTI | /root/public/datasets/SemanticKITTI | 82.83 GiB | Dataset | http://www.semantic-kitti.org/dataset.html#download | SemanticKITTI dataset |
| MPII Human Pose | /root/public/datasets/mpii_human_pose | 11.27 GiB | Dataset | http://human-pose.mpi-inf.mpg.de/#download | MPII Human Pose dataset |
| MVTec AD | /root/public/datasets/mvtec-ad | 4.9 GiB | Dataset | https://www.mvtec.com/company/research/datasets/mvtec-ad | Industrial anomaly detection dataset |
| ImageNet100 | /root/public/datasets/ImageNet100 | 13.41 GiB | Dataset | image-net.org | ImageNet 100-class dataset. Reference: https://github.com/HobbitLong/CMC/blob/master/imagenet100.txt |
| ImageNet | /root/public/datasets/imagenet-1k | 157.56 GiB | Dataset | image-net.org | ImageNet 1000-class classification and recognition dataset |
| SAIL-VOS | /root/public/datasets/SAIL-VOS | 173.18 GiB | Dataset | https://sailvos.web.illinois.edu/_site/dataset_readme.html | Semantic amodal instance-level video object segmentation dataset (available in the Inner Mongolia A region) |
| MOT17 | /root/public/datasets/mot17 | 5.46 GiB | Dataset | https://motchallenge.net/data/MOT17/ | MOT17 Challenge |
| Cityscapes | /root/public/datasets/cityscapes | 11.03 GiB | Dataset | www.cityscapes-dataset.net | Urban street scene instance/semantic segmentation |
| GOT10k | /root/public/datasets/GOT10k | 71.11 GiB | Dataset | got-10k.aitestunion.com | Large-scale object tracking dataset |
| MOT20 | /root/public/datasets/mot20 | 4.7 GiB | Dataset | motchallenge.net/data/MOT20/ | Pedestrian tracking in dense crowds (multi-object tracking) |
| CASIAWebFace | /root/public/datasets/CASIAWebFace | 4.1 GiB | Dataset | www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html | Large-scale face dataset, mainly used for identity verification and face recognition; contains 10,575 subjects and 494,414 images |
| DOTA v1 | /root/public/datasets/DOTA | 18.83 GiB | Dataset | captain-whu.github.io/DOTA | Object detection dataset for aerial images |
| ADEChallengeData2016 | /root/public/datasets/ADEChallengeData2016 | 1.1 GiB | Dataset | sceneparsing.csail.mit.edu | ADE20K scene semantic segmentation dataset |
| COCO 2017 | /root/public/datasets/coco2017 | 25.19 GiB | Dataset | Microsoft | COCO 2017 detection dataset |
| CIFAR10 | /root/public/datasets/cifar-10 | 163 MB | Dataset | www.cs.toronto.edu | CIFAR10 classification dataset |
| PASCAL VOC2012 | /root/public/datasets/voc2012 | 1.8 GiB | Dataset | host.robots.ox.ac.uk | VOC 2012 detection and semantic segmentation dataset |
| PASCAL VOC2007 | /root/public/datasets/voc2007 | 837 MB | Dataset | host.robots.ox.ac.uk | VOC 2007 detection and semantic segmentation dataset |
| RoBERTa pretrained model (Torch) | /root/public/models/RoBERTa-Pretrain-Model | 1.06 GiB | Model | Reference: https://docs.qq.com/sheet/DVnpkTnF6VW9UeXdh?tab=BB08J2 | RoBERTa pretrained model |
| Open-source Chinese-English bilingual dialogue model | /root/public/models/chatglm2-6b | 11.63 GiB | Model | https://huggingface.co/THUDM/chatglm2-6b | ChatGLM2-6B is the second-generation version of the open-source Chinese-English bilingual dialogue model ChatGLM-6B |
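
Several of the entries above run to hundreds of GiB, so it may help to check free space before copying one to local storage. The following is a rough sketch under stated assumptions: the function names are hypothetical, it relies only on the standard du, df, and awk tools, and it compares on-disk sizes only (an archive needs additional room once extracted).

```shell
#!/bin/sh
# Sketch only: check whether a directory fits on the local disk before copying.
# Function names are hypothetical, chosen for illustration.

# Size of the given directory in KiB, as reported by du.
dir_size_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# Free space, in KiB, on the filesystem that holds the given directory.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if the source ($1) is no larger than the free space at the
# destination ($2).
fits_locally() {
    [ "$(dir_size_kib "$1")" -le "$(free_kib "$2")" ]
}

# Example (path taken from the cifar-100 entry above):
#   fits_locally /root/public/datasets/cifar-100 /root \
#       && cp -r /root/public/datasets/cifar-100 /root/
```

Note that du has to walk the whole directory tree, so on the largest datasets the check itself can take a while.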
2024-12-06