-p : specifies the port number used to exchange data with the server (port 8501 in the example above).
-v : specifies the full path where the model to load is stored in SavedModel format (in the example above, the model files are loaded from /home/solaris/Desktop/tf_serving/saved_model); the part after the colon is the path inside the container, which becomes the REST API URL under which the model is served (models/fashion_model in the example above). A sketch of the full command appears below.
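The docker run command these flags belong to is not repeated in this section; a minimal sketch of what it would look like with the standard tensorflow/serving image is shown here. The -e MODEL_NAME environment variable is assumed (it is the usual way to tell the container which model to serve), and this is a reconstruction, not necessarily the exact command used above.

# a sketch of the docker run invocation the flags above describe (assumed, not the exact original command)
docker run -p 8501:8501 \
  -v /home/solaris/Desktop/tf_serving/saved_model:/models/fashion_model \
  -e MODEL_NAME=fashion_model \
  tensorflow/serving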
If the TensorFlow Serving server started successfully, you will see a log like the one below indicating that a REST API server is up on port 8501.
2020-05-30 10:38:09.443951: I tensorflow_serving/model_servers/server.cc:86] Building single TensorFlow model file config: model_name: fashion_model model_base_path: /models/fashion_model
2020-05-30 10:38:09.444251: I tensorflow_serving/model_servers/server_core.cc:462] Adding/updating models.
2020-05-30 10:38:09.444283: I tensorflow_serving/model_servers/server_core.cc:573] (Re-)adding model: fashion_model
2020-05-30 10:38:09.544965: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: fashion_model version: 1}
2020-05-30 10:38:09.545018: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: fashion_model version: 1}
2020-05-30 10:38:09.545046: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: fashion_model version: 1}
2020-05-30 10:38:09.545085: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/fashion_model/1
2020-05-30 10:38:09.549158: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2020-05-30 10:38:09.549198: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:264] Reading SavedModel debug info (if present) from: /models/fashion_model/1
2020-05-30 10:38:09.549395: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-05-30 10:38:09.585414: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:203] Restoring SavedModel bundle.
2020-05-30 10:38:09.607762: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:152] Running initialization op on SavedModel bundle at path: /models/fashion_model/1
2020-05-30 10:38:09.613230: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:333] SavedModel load for tags { serve }; Status: success: OK. Took 68142 microseconds.
2020-05-30 10:38:09.613782: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:105] No warmup data file found at /models/fashion_model/1/assets.extra/tf_serving_warmup_requests
2020-05-30 10:38:09.614102: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: fashion_model version: 1}
2020-05-30 10:38:09.619644: I tensorflow_serving/model_servers/server.cc:358] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2020-05-30 10:38:09.624274: I tensorflow_serving/model_servers/server.cc:378] Exporting HTTP/REST API at: localhost:8501 ...
[evhttp_server.cc:238] NET_LOG: Entering the event loop ...
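Before sending prediction requests, you can also confirm that the model loaded correctly through the model status endpoint that TensorFlow Serving exposes. A minimal check, assuming the server above is running on localhost:8501:

import requests

# query TensorFlow Serving's model status endpoint; a successfully loaded
# model reports state "AVAILABLE" under model_version_status
status = requests.get('http://localhost:8501/v1/models/fashion_model')
print(status.json())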
Sending Image Data via a POST Request and Visualizing the Prediction Results
You can now send input data to the URL specified when launching TensorFlow Serving (models/fashion_model) and receive the model's predictions for that input in response.
The example code below sends three Fashion MNIST test images to the running server via a POST request and visualizes the prediction results returned by the API server.
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
import random
import json
import requests
def show(idx, title):
    # display the idx-th test image with the given title
    plt.figure(figsize=(12, 3))
    plt.imshow(test_images[idx].reshape(28, 28))
    plt.axis('off')
    plt.title('\n\n{}'.format(title), fontdict={'size': 16})
    plt.show()
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# scale the values to 0.0 to 1.0
test_images = test_images / 255.0
# reshape for feeding into the model
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))
rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))
data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
# send data using POST request and receive prediction result
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
# show first prediction result
show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))
# specify the model version in the request URL, send data using a POST request, and receive the prediction result
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
# show all prediction results
for i in range(0, 3):
    show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
        class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))
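For reference, the TensorFlow Serving REST API also accepts a columnar request format that uses an "inputs" key in place of "instances"; the response then carries an "outputs" key instead of "predictions". A minimal sketch reusing the server, data, and headers from the example above:

# columnar ("inputs") request format; the row-based format above used "instances"
data_columnar = json.dumps({"signature_name": "serving_default",
                            "inputs": test_images[0:3].tolist()})
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict',
                              data=data_columnar, headers=headers)
# the columnar response stores results under "outputs" rather than "predictions"
outputs = json.loads(json_response.text)['outputs']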
As shown above, TensorFlow Serving makes it simple to stand up a REST API inference server for a model.