Upload large files in Java

CodeLee0106 2021-01-14 14:26:22

1. What is instant upload (秒传)?
Put simply: when you try to upload a file, the server first runs an MD5 check. If a file with the same MD5 already exists on the server, it simply hands you a new address, and what you later download is the very same file already stored there. If you want to defeat instant upload, you only need to change the MD5, which means changing the file's content itself (renaming it is not enough). For example, appending a few characters to a text file changes its MD5, and it will no longer be instant-uploaded.

2. The core instant-upload logic implemented in this article
2.1 Use Redis's set method to store the file's upload status, where the key is the MD5 of the uploaded file and the value is a flag marking whether the upload has completed.

2.2 If the flag is true, the upload has already finished; if the same file is uploaded again, the instant-upload path is taken. If the flag is false, the upload has not finished yet; in that case, call set again to record the path of the chunk-progress file, where the key is the file's MD5 plus a fixed prefix and the value is the path of that chunk-progress file.
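
A minimal sketch of that check, reusing the RedisUtil, FileConstant, SpringContextHolder, and FileUploadDTO helpers that appear in the server code later in this article; the hget/get read counterparts, the builder fields, and the method shape itself are assumptions, not the article's actual endpoint:

public class InstantUploadCheck {

    /** Decide between instant upload, resume, and fresh upload based on the Redis flag. */
    public FileUploadDTO checkFileMd5(String md5) {
        RedisUtil redisUtil = SpringContextHolder.getBean(RedisUtil.class);
        Object status = redisUtil.hget(FileConstant.FILE_UPLOAD_STATUS, md5);
        if (status == null) {
            // never seen before: a full upload is required
            return FileUploadDTO.builder().uploadComplete(false).build();
        }
        if ("true".equals(status.toString())) {
            // flag is true: the file already exists on the server, skip the upload entirely
            return FileUploadDTO.builder().uploadComplete(true).build();
        }
        // flag is false: a partial upload exists; the conf-file path stored under
        // FILE_MD5_KEY + md5 tells the client which chunks are still missing
        String confFilePath = (String) redisUtil.get(FileConstant.FILE_MD5_KEY + md5);
        return FileUploadDTO.builder().uploadComplete(false).path(confFilePath).build();
    }
}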
Chunked upload
1. What is chunked upload
Chunked upload splits the file to be uploaded into blocks of a certain size (each called a part) and uploads them separately; once all parts are uploaded, the server reassembles them into the original file.
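
To make the splitting step concrete, here is a small standalone sketch (the article's front end actually does this in JavaScript with webuploader; the FileSplitter class and its names are hypothetical):

import java.io.IOException;
import java.io.RandomAccessFile;

public class FileSplitter {

    /** Total number of parts for a given file length and chunk size. */
    public static int countParts(long fileLength, long chunkSize) {
        return (int) ((fileLength + chunkSize - 1) / chunkSize);
    }

    /** Read one fixed-size part; the last part may be shorter. */
    public static byte[] readPart(String path, long chunkSize, int chunkIndex) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long offset = chunkSize * chunkIndex;
            int len = (int) Math.min(chunkSize, raf.length() - offset);
            byte[] part = new byte[len];
            raf.seek(offset);
            raf.readFully(part);
            return part;
        }
    }
}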

2. Scenarios for chunked upload
1. Uploading large files

2. Poor network environments, where there is a risk of needing to retransmit

Breakpoint resume
1. What is breakpoint resume
Breakpoint resume means that a download or upload task (a file or an archive) is deliberately divided into several parts, and each part is uploaded or downloaded by its own thread. If a network failure occurs, the transfer can continue from the parts already completed instead of starting over from scratch. This article focuses on breakpoint resume for uploads.

2. Application scenarios
Breakpoint resume can be seen as a derivative of chunked upload, so wherever chunked upload applies, breakpoint resume applies as well.

3. The core logic of breakpoint resume
If a chunked upload is interrupted by a system crash, a network outage, or some other abnormal factor, the client needs to record the upload progress, so that when uploading becomes possible again it can continue from where the last attempt was interrupted.

To avoid having to restart from scratch when the client's progress data has been deleted, the server can also expose an interface that lets the client query which chunks have already been uploaded, so the client knows exactly where to resume from.
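
A minimal sketch of such a query, based on the conf progress file introduced in step b below (one byte per chunk: 127 means uploaded, 0 means missing); the class and method names are hypothetical:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.io.FileUtils;

public class UploadProgressQuery {

    /** Return the chunk numbers already uploaded by scanning the conf progress file. */
    public static List<Integer> uploadedChunks(File confFile) throws IOException {
        List<Integer> done = new ArrayList<>();
        if (!confFile.exists()) {
            return done; // nothing uploaded yet
        }
        byte[] status = FileUtils.readFileToByteArray(confFile);
        for (int i = 0; i < status.length; i++) {
            if (status[i] == Byte.MAX_VALUE) {
                done.add(i); // this chunk was written successfully
            }
        }
        return done;
    }
}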

4. Implementation steps
a. Scheme 1: general steps

Split the file to be uploaded into equal-size blocks according to some splitting rule;
Initialize a chunked-upload task and return a unique identifier for it;
Send each data block according to some strategy, serial or parallel (see the sketch after this list);
After sending completes, the server checks whether all the data has arrived and, if so, merges the blocks into the original file.
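
A minimal client-side sketch of the serial strategy, reusing the hypothetical FileSplitter above; uploadChunk stands in for the real HTTP request and is not part of the article's code:

import java.io.IOException;

public class SerialUploader {

    /** Send every chunk in order; on failure, record progress and resume later. */
    public static void uploadAll(String path, long fileLength, long chunkSize) throws IOException {
        int chunks = FileSplitter.countParts(fileLength, chunkSize);
        for (int chunk = 0; chunk < chunks; chunk++) {
            byte[] part = FileSplitter.readPart(path, chunkSize, chunk);
            // hypothetical HTTP call carrying the chunk data, its number, and the total count
            uploadChunk(part, chunk, chunks);
        }
    }

    private static void uploadChunk(byte[] part, int chunk, int chunks) {
        // placeholder for the actual HTTP POST to the upload endpoint
    }
}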
b. Scheme 2: the steps used in this article

The front end (client) slices the file at a fixed size and sends each slice to the back end (server) together with the chunk number and chunk size.
The server creates a conf file to record chunk status. The conf file's length equals the total number of chunks; for every chunk that arrives, a 127 is written at that chunk's position, so positions not yet uploaded default to 0 and uploaded ones hold Byte.MAX_VALUE (127). This step is the core of both breakpoint resume and instant upload.
From the chunk number in the request and the fixed, identical chunk size used for every slice, the server calculates the write offset, then writes the received slice data into the file at that position.
5. Chunked upload / breakpoint-resume code implementation
a. The front end slices files with Baidu's webuploader plugin. Since this article focuses on the server-side code, see the following link for how webuploader performs the slicing:

b. The back end writes files in two ways. One uses RandomAccessFile; if you are unfamiliar with RandomAccessFile, you can check the following link:

The other uses MappedByteBuffer; if you are unfamiliar with MappedByteBuffer, you can learn more at the following link:

The core server-side write code
a. RandomAccessFile implementation

@UploadMode(mode = UploadModeEnum.RANDOM_ACCESS)
@Slf4j
public class RandomAccessUploadStrategy extends SliceUploadTemplate {

  @Autowired
  private FilePathUtil filePathUtil;

  @Value("${upload.chunkSize}")
  private long defaultChunkSize;

  @Override
  public boolean upload(FileUploadRequestDTO param) {
    RandomAccessFile accessTmpFile = null;
    try {
      String uploadDirPath = filePathUtil.getPath(param);
      File tmpFile = super.createTmpFile(param);
      accessTmpFile = new RandomAccessFile(tmpFile, "rw");
      // the chunk size must match the value configured on the front end
      long chunkSize = Objects.isNull(param.getChunkSize()) ? defaultChunkSize * 1024 * 1024
          : param.getChunkSize();
      long offset = chunkSize * param.getChunk();
      // seek to this chunk's offset
      accessTmpFile.seek(offset);
      // write the chunk data
      accessTmpFile.write(param.getFile().getBytes());
      boolean isOk = super.checkAndSetUploadProgress(param, uploadDirPath);
      return isOk;
    } catch (IOException e) {
      log.error(e.getMessage(), e);
    } finally {
      FileUtil.close(accessTmpFile);
    }
    return false;
  }
}

b. MappedByteBuffer implementation

@UploadMode(mode = UploadModeEnum.MAPPED_BYTEBUFFER)
@Slf4j
public class MappedByteBufferUploadStrategy extends SliceUploadTemplate {

  @Autowired
  private FilePathUtil filePathUtil;

  @Value("${upload.chunkSize}")
  private long defaultChunkSize;

  @Override
  public boolean upload(FileUploadRequestDTO param) {
    RandomAccessFile tempRaf = null;
    FileChannel fileChannel = null;
    MappedByteBuffer mappedByteBuffer = null;
    try {
      String uploadDirPath = filePathUtil.getPath(param);
      File tmpFile = super.createTmpFile(param);
      tempRaf = new RandomAccessFile(tmpFile, "rw");
      fileChannel = tempRaf.getChannel();
      // the chunk size must match the value configured on the front end
      long chunkSize = Objects.isNull(param.getChunkSize()) ? defaultChunkSize * 1024 * 1024
          : param.getChunkSize();
      // map this chunk's region of the file and write the slice data into it
      long offset = chunkSize * param.getChunk();
      byte[] fileData = param.getFile().getBytes();
      mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, offset, fileData.length);
      mappedByteBuffer.put(fileData);
      boolean isOk = super.checkAndSetUploadProgress(param, uploadDirPath);
      return isOk;
    } catch (IOException e) {
      log.error(e.getMessage(), e);
    } finally {
      FileUtil.freedMappedByteBuffer(mappedByteBuffer);
      FileUtil.close(fileChannel);
      FileUtil.close(tempRaf);
    }
    return false;
  }
}
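
The finally block above calls FileUtil.freedMappedByteBuffer. A MappedByteBuffer keeps its file mapping alive until it is garbage-collected, which can block renaming or deleting the underlying file (notably on Windows), so the helper force-releases the mapping. A minimal sketch for JDK 8, assuming the internal sun.nio.ch.DirectBuffer API is accessible (on JDK 9+ this internal API changed, so this exact code will not compile there); the article's real FileUtil may differ:

import java.nio.MappedByteBuffer;
import java.security.AccessController;
import java.security.PrivilegedAction;

public class FileUtil {

    /** Force-release a MappedByteBuffer's native mapping (JDK 8 internal API). */
    public static void freedMappedByteBuffer(final MappedByteBuffer buffer) {
        if (buffer == null) {
            return;
        }
        AccessController.doPrivileged((PrivilegedAction<Void>) () -> {
            ((sun.nio.ch.DirectBuffer) buffer).cleaner().clean();
            return null;
        });
    }
}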

c. The core template class for file operations

@Slf4j
public abstract class SliceUploadTemplate implements SliceUploadStrategy {

  public abstract boolean upload(FileUploadRequestDTO param);

  protected File createTmpFile(FileUploadRequestDTO param) {
    FilePathUtil filePathUtil = SpringContextHolder.getBean(FilePathUtil.class);
    param.setPath(FileUtil.withoutHeadAndTailDiagonal(param.getPath()));
    String fileName = param.getFile().getOriginalFilename();
    String uploadDirPath = filePathUtil.getPath(param);
    String tempFileName = fileName + "_tmp";
    File tmpDir = new File(uploadDirPath);
    File tmpFile = new File(uploadDirPath, tempFileName);
    if (!tmpDir.exists()) {
      tmpDir.mkdirs();
    }
    return tmpFile;
  }

  @Override
  public FileUploadDTO sliceUpload(FileUploadRequestDTO param) {
    boolean isOk = this.upload(param);
    if (isOk) {
      File tmpFile = this.createTmpFile(param);
      FileUploadDTO fileUploadDTO =
          this.saveAndFileUploadDTO(param.getFile().getOriginalFilename(), tmpFile);
      return fileUploadDTO;
    }
    String md5 = FileMD5Util.getFileMD5(param.getFile());
    Map<Integer, String> map = new HashMap<>();
    map.put(param.getChunk(), md5);
    return FileUploadDTO.builder().chunkMd5Info(map).build();
  }

  /**
   * Check and update the file upload progress.
   */
  public boolean checkAndSetUploadProgress(FileUploadRequestDTO param, String uploadDirPath) {
    String fileName = param.getFile().getOriginalFilename();
    File confFile = new File(uploadDirPath, fileName + ".conf");
    byte isComplete = 0;
    RandomAccessFile accessConfFile = null;
    try {
      accessConfFile = new RandomAccessFile(confFile, "rw");
      // mark this chunk as complete
      System.out.println("set part " + param.getChunk() + " complete");
      // the conf file's length equals the total number of chunks; each uploaded chunk
      // writes a 127 at its index, so positions not yet uploaded stay at the default 0
      // and uploaded ones hold Byte.MAX_VALUE (127)
      accessConfFile.setLength(param.getChunks());
      accessConfFile.seek(param.getChunk());
      accessConfFile.write(Byte.MAX_VALUE);
      // completeList: check whether everything is done; the upload is complete only if
      // every byte in the array is 127 (every chunk uploaded successfully)
      byte[] completeList = FileUtils.readFileToByteArray(confFile);
      isComplete = Byte.MAX_VALUE;
      for (int i = 0; i < completeList.length && isComplete == Byte.MAX_VALUE; i++) {
        // AND operation: if any chunk is missing, isComplete is no longer Byte.MAX_VALUE
        isComplete = (byte) (isComplete & completeList[i]);
        System.out.println("check part " + i + " complete?:" + completeList[i]);
      }
    } catch (IOException e) {
      log.error(e.getMessage(), e);
    } finally {
      FileUtil.close(accessConfFile);
    }
    boolean isOk = setUploadProgress2Redis(param, uploadDirPath, fileName, confFile, isComplete);
    return isOk;
  }

  /**
   * Save the upload progress information into Redis.
   */
  private boolean setUploadProgress2Redis(FileUploadRequestDTO param, String uploadDirPath,
      String fileName, File confFile, byte isComplete) {
    RedisUtil redisUtil = SpringContextHolder.getBean(RedisUtil.class);
    if (isComplete == Byte.MAX_VALUE) {
      redisUtil.hset(FileConstant.FILE_UPLOAD_STATUS, param.getMd5(), "true");
      redisUtil.del(FileConstant.FILE_MD5_KEY + param.getMd5());
      confFile.delete();
      return true;
    } else {
      if (!redisUtil.hHasKey(FileConstant.FILE_UPLOAD_STATUS, param.getMd5())) {
        redisUtil.hset(FileConstant.FILE_UPLOAD_STATUS, param.getMd5(), "false");
        redisUtil.set(FileConstant.FILE_MD5_KEY + param.getMd5(),
            uploadDirPath + FileConstant.FILE_SEPARATORCHAR + fileName + ".conf");
      }
      return false;
    }
  }

  /**
   * Save the finished file.
   */
  public FileUploadDTO saveAndFileUploadDTO(String fileName, File tmpFile) {
    FileUploadDTO fileUploadDTO = null;
    try {
      fileUploadDTO = renameFile(tmpFile, fileName);
      if (fileUploadDTO.isUploadComplete()) {
        System.out
            .println("upload complete !!" + fileUploadDTO.isUploadComplete() + " name=" + fileName);
        // TODO save file information to the database
      }
    } catch (Exception e) {
      log.error(e.getMessage(), e);
    }
    return fileUploadDTO;
  }

  /**
   * Rename a file.
   *
   * @param toBeRenamed   the file to be renamed
   * @param toFileNewName the new name
   */
  private FileUploadDTO renameFile(File toBeRenamed, String toFileNewName) {
    // check that the file to be renamed exists and is a regular file
    FileUploadDTO fileUploadDTO = new FileUploadDTO();
    if (!toBeRenamed.exists() || toBeRenamed.isDirectory()) {
      log.info("File does not exist: {}", toBeRenamed.getName());
      fileUploadDTO.setUploadComplete(false);
      return fileUploadDTO;
    }
    String ext = FileUtil.getExtension(toFileNewName);
    String p = toBeRenamed.getParent();
    String filePath = p + FileConstant.FILE_SEPARATORCHAR + toFileNewName;
    File newFile = new File(filePath);
    // rename the file
    boolean uploadFlag = toBeRenamed.renameTo(newFile);
    fileUploadDTO.setMtime(DateUtil.getCurrentTimeStamp());
    fileUploadDTO.setUploadComplete(uploadFlag);
    fileUploadDTO.setPath(filePath);
    fileUploadDTO.setSize(newFile.length());
    fileUploadDTO.setFileExt(ext);
    fileUploadDTO.setFileId(toFileNewName);
    return fileUploadDTO;
  }
}
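
The template above calls FileMD5Util.getFileMD5 to report a failed chunk's MD5 back to the client. A minimal sketch of such a utility, assuming the parameter is a Spring MultipartFile as elsewhere in the code (the article's real FileMD5Util may differ):

import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import org.springframework.web.multipart.MultipartFile;

public class FileMD5Util {

    /** Stream the file through an MD5 digest so large files are never fully loaded into memory. */
    public static String getFileMD5(MultipartFile file) {
        try (InputStream in = file.getInputStream()) {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192];
            int len;
            while ((len = in.read(buffer)) != -1) {
                md5.update(buffer, 0, len);
            }
            // render the 16 digest bytes as a 32-character hex string
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 calculation failed", e);
        }
    }
}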

Implementing chunked upload requires the front end and back end to cooperate: in particular, the chunk size used on the front end and the back end must be identical, otherwise the upload will break. Note also that file operations like these normally call for a dedicated file server, built for example with FastDFS or HDFS.

With this sample code running on a 4-core, 8 GB machine, uploading a 24 GB file takes more than 30 minutes, and most of that time is spent on the front-end MD5 calculation; the back-end write speed is comparatively fast. If the project team feels that building a self-hosted file server costs too much, and the project only needs upload and download, then Alibaba's OSS service is recommended; its introduction can be found on the official website:

Alibaba's OSS is essentially an object storage service, not a file server, so if you need to delete or modify large numbers of files, OSS may not be a good choice.

At the end of the article there is a link to an OSS form-upload demo. With OSS form upload, files can be uploaded directly from the front end to the OSS server, shifting all of the load onto OSS:

Copyright notice
This article was written by [CodeLee0106]. Please include a link to the original when reposting. Thanks.
https://javamana.com/2021/01/20210114125834036P.html
