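I recently used Alibaba Cloud's OCR text recognition API.
Let's look at the result first.
I used the general-purpose text recognition service. The implementation steps are as follows:
1. Purchase Alibaba Cloud's general-purpose text recognition API
It is currently free (0 yuan) and includes 500 calls. After purchasing, go to Console -> Cloud Marketplace to view the purchased API and copy its APPCODE.
2. Send the request according to the official API documentation
I use Retrofit for the network request and define the following interface: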
interface AliService {
    @POST("/api/predict/ocr_general")
    Call<HttpResult> getText(@Body RequestBody body, @Header("Authorization") String authorization);
}
Based on the sample JSON response in the official documentation, define an HttpResult class to receive the data. Remember to add getters, setters and constructors:
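public class HttpResult {
    private String request_id;
    private List<Bean> ret;
    private boolean success;

    public List<Bean> getRet() { return ret; }
    // Remaining getters, setters and constructors omitted for brevity.
}

class Bean {
    private Rect rect;
    private String word;

    public String getWord() { return word; }

    // Static nested class: Gson's documentation recommends this form for deserialization.
    static class Rect {
        private float angle;
        private float height;
        private float left;
        private float top;
        private float width;
    }
}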
The image is a Bitmap, so it must be Base64-encoded before it is sent in the request.
public static String bitmapToBase64(Bitmap bitmap) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // JPEG quality 40 keeps the payload small; 100 would mean the highest quality (least compression).
    bitmap.compress(Bitmap.CompressFormat.JPEG, 40, bos);
    byte[] bytes = bos.toByteArray();
    // Variant without a prefix. NO_WRAP is required so that the output contains no line breaks.
    return Base64.encodeToString(bytes, Base64.NO_WRAP);
    // Variant with a data-URI prefix, if one is needed. NO_WRAP is still required.
    //return "data:image/jpeg;base64," + Base64.encodeToString(bytes, Base64.NO_WRAP);
}
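To make the NO_WRAP behaviour and the optional prefix easier to try outside of Android, here is a minimal sketch in plain Java. It assumes `java.util.Base64` as a stand-in for `android.util.Base64`; the class name `Base64Sketch` and its two helper methods are hypothetical and not part of the original project.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, plain-Java stand-in for the Android code above.
final class Base64Sketch {

    /** Encodes raw image bytes as single-line Base64, comparable to Base64.NO_WRAP on Android. */
    static String encode(byte[] imageBytes) {
        // The basic java.util encoder never inserts line separators.
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    /** Same payload with a data-URI prefix, for the case where a prefix is required. */
    static String encodeAsDataUri(byte[] imageBytes) {
        return "data:image/jpeg;base64," + encode(imageBytes);
    }

    public static void main(String[] args) {
        // Stand-in for real JPEG bytes.
        byte[] sample = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(encode(sample));
        System.out.println(encodeAsDataUri(sample));
    }
}
```

Whichever variant is used, the point is that the string placed in the JSON body must be one unbroken line.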
Build the request body from the request parameters listed in the official documentation:
Retrofit retrofit = new Retrofit.Builder()
        .baseUrl("https://tysbgpu.market.alicloudapi.com")
        .addConverterFactory(GsonConverterFactory.create())
        .build();
AliService aliService = retrofit.create(AliService.class);
String body = "{\"image\":\"" + bitmapToBase64(bitmap) + "\"," +
        "\"configure\":{\"min_size\":16,\"output_prob\":false,\"output_keypoints\":false,\"skip_detection\":false,\"without_predicting_direction\":false}}";
RequestBody requestBody = RequestBody.create(okhttp3.MediaType.parse("application/json;charset=UTF-8"), body);
Call<HttpResult> call = aliService.getText(requestBody, "APPCODE " + APPCODE);
call.enqueue(new Callback<HttpResult>() {
    @Override
    public void onResponse(Call<HttpResult> call, Response<HttpResult> response) {
        // Parse the returned JSON and update the UI.
        // body() is null for unsuccessful responses, so check it before use.
        HttpResult result = response.body();
        if (result != null && result.getRet() != null) {
            List<Bean> beans = result.getRet();
            for (Bean bean : beans)
                text += bean.getWord() + "\n";
            activity.runOnUiThread(new Runnable() {
                @Override
                public void run() {
                    textView.setText(text);
                }
            });
        }
    }

    @Override
    public void onFailure(Call<HttpResult> call, Throwable t) {
        Log.e(TAG, "onFailure: " + t.getMessage());
    }
});
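For readers who want to inspect the request without Retrofit, the same HTTP call can be sketched with the JDK's `java.net.http` API (Java 11 or later). This is only an illustration under assumptions: the class `OcrRequestSketch` and its methods are hypothetical, the endpoint is simply the base URL and path from the code above joined together, and the request is built but never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: builds (but does not send) the OCR request with the JDK HTTP API.
final class OcrRequestSketch {

    // Base URL plus path, as used in the Retrofit code above.
    static final String ENDPOINT =
            "https://tysbgpu.market.alicloudapi.com/api/predict/ocr_general";

    /** Builds the JSON body; Base64 text contains no characters that need JSON escaping. */
    static String buildBody(String base64Image) {
        return "{\"image\":\"" + base64Image + "\","
                + "\"configure\":{\"min_size\":16,\"output_prob\":false,"
                + "\"output_keypoints\":false,\"skip_detection\":false,"
                + "\"without_predicting_direction\":false}}";
    }

    /** Builds a POST request carrying the APPCODE in the Authorization header. */
    static HttpRequest buildRequest(String appCode, String base64Image) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Authorization", "APPCODE " + appCode)
                .header("Content-Type", "application/json; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(base64Image)))
                .build();
    }

    public static void main(String[] args) {
        // "your-appcode" is a placeholder, not a real credential.
        HttpRequest request = buildRequest("your-appcode", "aGVsbG8=");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Authorization").orElse(""));
    }
}
```

Note the single space between `APPCODE` and the code itself in the Authorization header, which matches the string passed to `getText` above.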
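That is the core code for calling the Alibaba Cloud OCR API. If you are not sure how to launch the camera, take a photo and get the image back, read on.
3. Taking a photo with the camera on Android and getting the image back
① Request the permissions in the manifest file, AndroidManifest.xml.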
<uses-permission android:name="android.permission.INTERNET"/>
<uses-permission android:name="android.permission.CAMERA"/>
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
Declare the FileProvider inside the application element:
<!-- authorities: use your own package name followed by ".provider" -->
<provider
    android:authorities="com.briana.aliocr.provider"
    android:name="androidx.core.content.FileProvider"
    android:exported="false"
    android:grantUriPermissions="true">
    <meta-data
        android:name="android.support.FILE_PROVIDER_PATHS"
        android:resource="@xml/file_paths" />
</provider>
Create a new XML file under res/xml and name it file_paths.xml.
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <paths>
        <external-path
            name="camera_photos"
            path="." />
        <!-- A path of '.' covers the whole external storage; in general the shared directory is Environment.getExternalStorageDirectory() + "/path/" -->
    </paths>
</resources>
② Modify MainActivity as follows:
Before launching the camera, check whether the permissions have been granted, and request them if not.
private static final int PERMISSIONS_REQUEST_CODE = 1;

private boolean hasPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.WRITE_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED
            || ContextCompat.checkSelfPermission(this, Manifest.permission.READ_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED
            || ContextCompat.checkSelfPermission(this, Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(this, new String[]{Manifest.permission.WRITE_EXTERNAL_STORAGE, Manifest.permission.READ_EXTERNAL_STORAGE, Manifest.permission.CAMERA}, PERMISSIONS_REQUEST_CODE);
        return false;
    } else {
        return true;
    }
}
Override onRequestPermissionsResult to check whether the user granted the requested permissions. If so, call takePhoto() to take the picture.
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions, @NonNull int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    if (requestCode == PERMISSIONS_REQUEST_CODE) {
        if (grantResults.length > 0) {
            for (int grantResult : grantResults) {
                if (grantResult == PackageManager.PERMISSION_DENIED) {
                    return;
                }
            }
            takePhoto();
        }
    }
}
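The "every permission granted" check in the callback above can be pulled out into a small pure function, which makes it easy to test without a device. The sketch below is plain Java; the class `PermissionResultSketch` and the method `allGranted` are hypothetical names, and the two constants are copies of the values of `PackageManager.PERMISSION_GRANTED` and `PackageManager.PERMISSION_DENIED`.

```java
// Hypothetical helper: mirrors the grant-result loop without any Android dependency.
final class PermissionResultSketch {

    // Same values as android.content.pm.PackageManager.PERMISSION_GRANTED / PERMISSION_DENIED.
    static final int PERMISSION_GRANTED = 0;
    static final int PERMISSION_DENIED = -1;

    /** Returns true only if at least one result is present and none of them is a denial. */
    static boolean allGranted(int[] grantResults) {
        if (grantResults == null || grantResults.length == 0) {
            return false; // an empty array means the request was cancelled
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allGranted(new int[]{PERMISSION_GRANTED, PERMISSION_GRANTED}));
        System.out.println(allGranted(new int[]{PERMISSION_GRANTED, PERMISSION_DENIED}));
    }
}
```

In the activity, the callback would then reduce to calling takePhoto() when `allGranted(grantResults)` returns true.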
Launch the camera to take a photo and record the image file path:
private static final int CAMERA_REQUEST_CODE = 2;
File mFile;
Uri imageUri;

private void takePhoto() {
    if (!hasPermission()) {
        return;
    }
    File path = new File(Environment.getExternalStorageDirectory(), "img");
    mFile = new File(path, System.currentTimeMillis() + ".jpg");
    try {
        if (!path.exists())
            path.mkdir();
        if (!mFile.exists())
            mFile.createNewFile();
    } catch (IOException e) {
        e.printStackTrace();
    }
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
        String authority = getPackageName() + ".provider";
        imageUri = FileProvider.getUriForFile(this, authority, mFile);
    } else {
        imageUri = Uri.fromFile(mFile);
    }
    Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
    intent.putExtra(MediaStore.EXTRA_OUTPUT, imageUri);
    startActivityForResult(intent, CAMERA_REQUEST_CODE);
}
Override onActivityResult to load the image from that path, display it in the imageView, and then call the Alibaba Cloud API to recognize the text in the image.
@Override
protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    // Only continue if the user actually took a photo instead of cancelling.
    if (requestCode == CAMERA_REQUEST_CODE && resultCode == RESULT_OK) {
        Bitmap photo = BitmapFactory.decodeFile(mFile.getAbsolutePath());
        imageView.setImageBitmap(photo);
        AliOcr aliOcr = new AliOcr();
        aliOcr.getText(this, photo);
    }
}
Add a click listener to the button so that tapping it takes a photo:
button.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View view) {
        takePhoto();
    }
});
If this helped you, give it a like~